What is happening in each iteration of PSI-BLAST?


I understand that the first BLAST iteration yields almost the same results as blastp. The second iteration generates different results because it uses a different scoring matrix, built from the results of the first search.

But I don't understand how exactly the second matrix is generated.

Addition to previous answers:

PSI-BLAST is, in a sense, an iterative learning algorithm: it uses the results of the first alignment to build a position-specific scoring matrix (PSSM), which is then used to score the next iteration of alignment. I recommend the NCBI Bookshelf page on PSI-BLAST.

PSI-BLAST adopts a scoring scheme (the PSSM) built from a given set of data (the aligned sequences), rather than using a generalized scoring matrix. This adaptation updates the prior knowledge of the homologs and helps to detect similar sequences that were otherwise undetectable. With each iteration the PSSM is updated, thereby making the BLAST search more sensitive in finding homologs.

How PSSM is constructed:

If a position in the alignment is conserved (i.e., the same residue appears in many sequences), it receives a high score, whereas a position with low conservation gets a low score.

The score is based on the relative counts of a residue at a given position. If you want to know the details, see here.
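The column-scoring idea can be sketched in a few lines of Python. This is a toy illustration, not PSI-BLAST's actual weighting scheme: the four-sequence alignment, the uniform background frequency, and the simple pseudocount are all assumptions made for clarity.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # uniform background (simplification)

def build_pssm(aligned_seqs, pseudocount=1.0):
    """Column-wise log-odds scores from an ungapped alignment."""
    n = len(aligned_seqs)
    pssm = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        scores = {}
        for aa in AMINO_ACIDS:
            # observed frequency with a pseudocount to avoid log(0)
            freq = (counts[aa] + pseudocount) / (n + pseudocount * len(AMINO_ACIDS))
            scores[aa] = math.log2(freq / BACKGROUND)
        pssm.append(scores)
    return pssm

# Toy alignment of four hypothetical BLAST hits.
pssm = build_pssm(["ACDA", "ACDE", "ACEA", "GCDA"])
# Column 2 is fully conserved (always C), so C scores high there,
# while residues never observed in that column score below zero.
```

Searching with such a matrix means scoring a candidate sequence position by position against these column scores instead of against one fixed substitution matrix.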

PSI-BLAST is Position-Specific Iterated BLAST.

I'm probably out of my depth here, but my understanding is that the results of the first, essentially standard, BLAST search are used to create a multiple sequence alignment. This alignment is then converted to a Position Specific Scoring Matrix (PSSM). (The linked Wikipedia entry shows this for DNA sequences, but the principle is exactly the same for protein sequences.) This matrix can then be used as the probe for the next round of BLAST, and the results of that alignment can be used judiciously to improve the matrix, and so on until the results converge.

Here is a link to some slides by Stephen Altschul which go into lots of detail.

From Russ Altman (Stanford University) lecture:

BLAST takes a sequence as input, searches the database, and outputs a set of alignments (a top-ranked alignment, followed by a second, a third, and so on). You then take the top-scoring sequences (you might use the E-value as a cutoff) and create a position-specific scoring matrix (PSSM) based on the occurrences of amino acids in the columns of these alignments. This is justified because there is a high chance that these sequences are homologous to the query sequence, so looking at their amino acids gives a good sense of the variability allowed at each position of the initial query. The next step is to re-search the database for more hits. Now, instead of searching with a sequence, you are searching with a profile, and you get another set of hits. You can iterate a few times, which gives a much more sensitive search.

Polymerase chain reaction

Polymerase chain reaction (PCR) is a method widely used to rapidly make millions to billions of copies (complete copies or partial copies) of a specific DNA sample, allowing scientists to take a very small sample of DNA and amplify it (or a part of it) to a large enough amount to study in detail. PCR was invented in 1983 by the American biochemist Kary Mullis at Cetus Corporation. It is fundamental to many of the procedures used in genetic testing and research, including analysis of ancient samples of DNA and identification of infectious agents. Using PCR, copies of very small amounts of DNA sequences are exponentially amplified in a series of cycles of temperature changes. PCR is now a common and often indispensable technique used in medical laboratory research for a broad variety of applications including biomedical research and criminal forensics. [1] [2]

The majority of PCR methods rely on thermal cycling. Thermal cycling exposes reactants to repeated cycles of heating and cooling to permit different temperature-dependent reactions – specifically, DNA melting and enzyme-driven DNA replication. PCR employs two main reagents – primers (which are short single strand DNA fragments known as oligonucleotides that are a complementary sequence to the target DNA region) and a DNA polymerase. In the first step of PCR, the two strands of the DNA double helix are physically separated at a high temperature in a process called nucleic acid denaturation. In the second step, the temperature is lowered and the primers bind to the complementary sequences of DNA. The two DNA strands then become templates for DNA polymerase to enzymatically assemble a new DNA strand from free nucleotides, the building blocks of DNA. As PCR progresses, the DNA generated is itself used as a template for replication, setting in motion a chain reaction in which the original DNA template is exponentially amplified.
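The "chain reaction" in the last sentence is simple arithmetic: each cycle can at most double the number of copies. A minimal sketch of that idealized model (the efficiency parameter and the absence of a late-cycle plateau are simplifications of real reactions):

```python
def pcr_copies(n_cycles, start_copies=1, efficiency=1.0):
    """Idealized PCR yield: each cycle multiplies copies by (1 + efficiency).

    efficiency = 1.0 means perfect doubling; real reactions fall short of
    this and plateau in late cycles, which this simple model ignores.
    """
    return start_copies * (1 + efficiency) ** n_cycles

# With perfect doubling, 30 cycles turn one template molecule
# into 2**30 (about a billion) copies.
n = pcr_copies(30)
```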

Almost all PCR applications employ a heat-stable DNA polymerase, such as Taq polymerase, an enzyme originally isolated from the thermophilic bacterium Thermus aquaticus. If the polymerase used were heat-susceptible, it would denature under the high temperatures of the denaturation step. Before the use of Taq polymerase, DNA polymerase had to be manually added every cycle, which was a tedious and costly process. [3]

Applications of the technique include DNA cloning for sequencing, gene cloning and manipulation, and gene mutagenesis; construction of DNA-based phylogenies and functional analysis of genes; diagnosis and monitoring of genetic disorders; amplification of ancient DNA [4]; analysis of genetic fingerprints for DNA profiling (for example, in forensic science and parentage testing); and detection of pathogens in nucleic acid tests for the diagnosis of infectious diseases.

Spatial: The Next Omics Frontier

Gene expression changes throughout a region of tissue taken from a patient with colorectal cancer are analyzed using NanoString Technologies’ GeoMx Digital Spatial Profiler. Until now, the GeoMx has made it possible to analyze 1,800 genes using the Cancer Transcriptome Atlas. Recently, NanoString announced a Whole Transcriptome Atlas service, which provides an unbiased view of 18,000-plus protein-coding genes.

Medicine is moving from very blunt instruments to molecules that are “the finest scalpel you could ever have.” So says George Church, PhD, professor of genetics at Harvard Medical School (HMS), who has a pretty decent track record when it comes to appraising new genomics technologies. This evolution, Church notes in a Wyss Institute video, has created the need for observational tools that allow researchers to “see at that high level of resolution and comprehensiveness.”

Church is referring to the recent development of spatial transcriptomic technology (known simply as “spatial”). Until now, single-cell sequencing techniques, such as RNA sequencing (RNA-seq), have been limited to tissue-dissociated cells, that is, cells extracted from ground-up tissue. Such cells lose all spatial information.

Spatial transcriptomics gives a rich, spatial context to gene expression. By marrying imaging and sequencing, spatial transcriptomics can map where particular transcripts exist on the tissue, indicating where particular genes are expressed.

The understanding of the spatial context of biology in human specimens has always been critical to our understanding of human disease, notes Church’s HMS colleague David Ting, MD, assistant professor in medicine. Interpretation of histology, he adds, is still an art that makes the anatomic pathologist indispensable for the accurate diagnosis of disease. But these new spatial technologies will help unlock a deeper understanding of what is happening in tissue, which will be applicable to most areas of biomedical research.

With spatial transcriptomics, “not only can you characterize what is there, but now you can go one step further to see how cells are interacting,” notes Elana J. Fertig, PhD, associate professor of oncology, applied mathematics and statistics, and biomedical engineering, Johns Hopkins University.

“We used to attempt to do this in cancer biology, using laser capture microdissection and bulk RNA-seq data,” Fertig adds, “by cutting out regions and profiling them.” But those expedients still averaged gene expression data from individual cells, losing fine-grained information about cell-state transitions that could have been used to clarify cell-to-cell interactions.

Now, contextual information that used to slip away can be captured with spatial transcriptomics. “We see not only where the immune cells are,” Fertig explains, “but what states they are in, and the interaction between the tumor and immune system.” It is a true combination of cellular data and molecular genomics which, she insists, is needed to improve outcomes in cancer biology.

No stranger to the development of new sequencing technologies, Joe Beechem, PhD, the chief scientific officer and senior vice president of research and development at NanoString Technologies, moved from a substantial academic career to industry about 20 years ago. A pioneer of some of the first next-generation sequencing (NGS) instruments, Beechem tells GEN that it is fun to see biology continually reinventing itself in these new ways.

He maintains the way the spatial transcriptomics field is breaking feels “almost identical” to the rush of building the first NGS instruments. “This time,” he declares, “is as big—if not bigger.”

Spatial, meet COVID-19

Researchers are diving into COVID-19 research head first, working at full throttle to elucidate the effect of the SARS-CoV-2 virus on its human host, with many focused naturally on the lung. To understand changes in discrete areas of host tissue, some researchers are turning to spatial transcriptomics.

For example, a group of researchers at Massachusetts General Hospital, including Ting, used spatial transcriptomics, among other techniques, to analyze autopsy specimens from 24 patients who succumbed to COVID-19. The work, notes Ting, has expanded our understanding of SARS-CoV-2 infection “through the lens of histological findings identified by pathologists.”

First, Ting’s group pinpointed the location of the virus using RNA in situ hybridization in lung tissue. Using NanoString’s GeoMx Digital Spatial Profiler, the group was able to analyze the transcriptional and proteomic changes in these areas. The findings corresponded to distinct spatial expression of interferon response genes and immune checkpoint genes, demonstrating the intrapulmonary heterogeneity of SARS-CoV-2 infection.

According to Ting, the data revealed a tremendous interferon response specifically to the regions containing SARS-CoV-2 viral RNA, indicating that this is the dominant immunological response to the virus. In addition, protein analysis in this same region showed upregulation of immune regulatory molecules including PD-L1, CTLA4, and IDO1, which are known to be T-cell suppressive in the context of cancer.

The results have been posted as a preprint, “Temporal and Spatial Heterogeneity of Host Response to SARS-CoV-2 Pulmonary Infection,” on the medRxiv preprint server.

Sarah Warren, PhD, NanoString’s director of translational science, tells GEN that the GeoMx was well positioned to support COVID-19 research because the platform had already been established to work with formalin-fixed paraffin-embedded (FFPE) specimens—samples of the type being provided by COVID-19 autopsies. Warren asserts that spatial can be implemented to understand the diversity of ways in which SARS-CoV-2 is impacting the different organ systems, which is, she emphasizes, “something that cannot be done with any other platform.”

Another group of researchers is utilizing spatial transcriptomics to probe COVID-19 patient lung tissue. A recent talk from the NIH COVID-19 Scientific Interest Group (SIG) given by Aviv Regev, PhD, former core member at the Broad Institute and recently appointed head of San Francisco–based Genentech Research and Early Development (gRED), highlighted her laboratory’s work to understand the relationship between the virus and host by performing spatial transcriptomics.

Regev presented data from FFPE specimens from the trachea and left upper lobe from a COVID-19 patient. By using the GeoMx to analyze 1,800 RNA targets, including key genes for cells that are infected by the virus, her team was able to analyze and compare the RNA expression of SARS-CoV-2-infected cells versus neighboring, uninfected cells.

According to Regev, images of the FFPE specimens are “terrifying” because they depict how aggressively the virus proliferates and ravages the lungs. But she also noted that the infection “is specific to some lung regions, while other lung regions remain untouched.” Such patterns offer clues to SARS-CoV-2 infection that may help in the battle to stop the pandemic.

A consortium and the next Chromium

When 10x Genomics rolled out its single-cell sequencing platform, the Chromium, the company kept asking its customers what the single-cell world needed. It was by talking to customers at Human Cell Atlas meetings that 10x Genomics fully realized the excitement behind spatial transcriptomics technology.

Ben Hindson, PhD, chief scientific officer and president of 10x Genomics, tells GEN that customer feedback led the company to acquire Spatial Transcriptomics, a Swedish company that took what Hindson describes as a nice, scalable approach. 10x Genomics then incorporated Spatial Transcriptomics’ technology into the Visium Spatial Gene Expression Solution, a product that began shipping in November 2019. Hindson says that adoption has been tremendous since then, and that 10x Genomics is enhancing Visium so that it may use fluorescently labeled antibodies and detect proteins.

A nice thing about Visium, notes Hindson, is that a lot of big equipment is unnecessary. It’s just a microscope slide and a reagent kit. The slide has 5,000 regions, each of which can capture RNA from 1–10 cells at a time. The Visium takes an unbiased approach to profiling the cells, something that Johns Hopkins’ Fertig really appreciates.

Fertig, who uses Visium in her cancer research lab, is a member of the 10x Genomics Visium Clinical Translational Research Network (CTRN). The newly formed group (which has yet to have its first Zoom meeting) has been formed across different disciplines. Fertig says it is really exciting to have researchers from different fields focused on implementing this technology in the clinic.

She explains that the commonalities that exist across fields enable “team science” across disciplines. The CTRN, she adds, gives the researchers a “chance to think in a similar headspace” and suggests “new ways to translate biological discoveries into therapies.” She emphasizes that the CTRN fosters a new research model. Ordinarily, research consortia are centered around a common disease. The CTRN, however, is focused on how a common technology may drive progress against multiple diseases.

Teamwork makes the dream work

The GeoMx and Visium platforms may make spatial transcriptomic experiments more accessible for researchers, but challenges still exist. The first challenge is knowing which areas to study. Without some direction as to the region or cells to study, it is nearly impossible, Ting tells GEN, to determine whether differences are based on the same cell types in different locations or are simply driven by the wide variety of cell types inherent to human tissue.

The second challenge is interpreting the increasing amounts of transcriptional and proteomic data. Here, again, dealing with the challenge is a matter of knowing where to look—or asking the right question. “It’s more about knowing what constellations you are looking for in the sky,” Ting explains, “rather than staring at the entire night sky and trying to understand the entire galaxy.”

Yet another challenge is the diversity of skills needed to run a successful spatial transcriptomics experiment. One kind of expertise is needed to obtain the best sample from the most appropriate portion of the tumor another is needed to implement the long protocol and yet another is needed to interpret the data.

Coordinating all these kinds of expertise can realize the team science approach described by Fertig. She admits that if scientific teams are to succeed, significant investments in time and effort are needed because “different layers of skills” are required. She adds, however, that the investments can be worthwhile. Not only can the team approach be a very effective way to do science, it can be critical, she stresses, for favorable outcomes.

Neither first nor last

Several spatial transcriptomics technologies are in development. One iteration is known as fluorescent in situ sequencing (FISSEQ). It was first proposed in 2003 by the Church lab, which published a realization of the FISSEQ concept in a 2014 Science paper. The technology was further developed at the Wyss Institute and is now being commercialized by the startup company Readcoor, which provides FISSEQ-based instruments, kits, and software.

A fluorescent in situ sequencing (FISSEQ) technology developed at the Wyss Institute can present 3D images of molecular targets in tissue or cell samples. The technology allows mRNA to stay fixed in its original location while it is converted into DNA amplicons. Then, DNA sequences are revealed using fluorescent dyes and a super-resolution microscope. The technique allows for panomic spatial sequencing that captures cell morphology.

Last February, Readcoor staged a splashy launch of its FISSEQ platform at the 2020 Advances in Genome Biology and Technology meeting. Company representatives announced a Select Release Program and even distributed T-shirts displaying George Church’s face. Since then, the platform’s rollout has been overtaken by events, namely, the disruptions due to the COVID-19 pandemic.

Readcoor tells GEN that the pandemic has “thrown a wrench” into the rollout by delaying or otherwise complicating the company’s installation plans. The size of the wrench is unclear: Readcoor did not reveal how many customers are still waiting on installation, or, more generally, how many customers agreed to participate in the Select Release Program. Readcoor is also quiet on the cost of its platform. By comparison, NanoString has sold over 125 GeoMx systems (with just over 70 installed) at $300,000 per system.

But Evan Daugharthy, PhD, ReadCoor’s vice president of science, is optimistic that the company will “turn a corner soon and be able to deliver our instruments to our customers.” Readcoor affirms that it is on track for full commercial launch of the platform in 2021.

Readcoor is not the first company out of the gate, and it will certainly not be the last. Rest assured, Beechem says, there will be multiple new companies every year—”ankle biters,” he calls them—trying to gain a foothold in this space. But Beechem warns that it takes a long time to build a platform. You not only have to have a chemistry that works to create a spatial platform, you also need to marry an imaging platform and a sequencer, which is a multidimensional challenge. Five years ago, he explains, he was spending 90% of his time building the technology.

Regardless of whether the spatial pioneers or the new and emerging companies mentioned in this article are successful, the promise of spatial transcriptomics is profound. This “could lead to a new era,” asserts Church, because it affords investigation into “comprehensive expression and relationships among cells over vast spatial distances.” When you look at the whole set, he notes, you find that “the place where you didn’t think to look” is way out of proportion. Through comprehensive transcript surveys, there is a better chance of finding the one transcript that is causative—a potential weak point—where a therapeutic might just work.

Two decades after we celebrated our first detailed map of the human genome, it seems that spatial is the next exciting frontier for genomics to conquer.

SSH causes while loop to stop

I have finally managed to boil down a problem I have been struggling with for a few weeks. I use SSH with "authorized keys" to run commands remotely. All is fine except when I do it in a while loop. The loop terminates after completing any iteration with an ssh command.

For a long time I thought this was some kind of ksh weirdness, but I now discovered bash does in fact behave identically.

A small sample program to reproduce the problem. This is distilled from a larger implementation which takes snapshots and replicates them amongst the nodes in a cluster.

(Note there is a TAB character in the grep search expression as per the definition of the behaviour of the zfs list "-H" option.)

My sample system has some ZFS filesystems for the root, where all the "zones" have their root file system on a dataset named similar to

The above loop should create a snapshot for each of the selected datasets, but instead it operates only on the first one and then exits.

That the program finds the right number of datasets can easily be confirmed by checking the "/tmp/actionlist" file after the script exits.

If the ssh command is replaced by, for example, an echo command, then the loop iterates through all the input lines. Or my favourite - prepend "echo" to the offending command.

If I use a for loop instead then it also works, but due to the potential size of the list of datasets this could cause problems with the maximum expanded command line length.

I am now 99.999% sure that only those loops with ssh commands in them give me problems!

Note that the iteration in which the ssh command runs completes! It is as if the data piped into the while loop is suddenly lost. If the first few input lines don't perform an ssh command, then the loop goes on until it actually runs the SSH command.

On my laptop where I am testing this I have two Solaris 10 VMs with only about two or three sample datasets, but the same is happening on the large SPARC systems where this is meant to go live, and there are many datasets.
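The symptom described above matches ssh's stdin handling: unless told otherwise (for example with `ssh -n`, or by redirecting the remote command's input from `/dev/null`), ssh forwards its standard input to the remote command, so the first ssh call inside the loop reads the rest of the data feeding `while read`, and the loop then sees end-of-file. The mechanism can be reproduced without ssh at all; in this Python sketch, `cat` stands in for any child process that inherits the loop's input descriptor and reads it to exhaustion (the file contents are a made-up stand-in for the dataset list):

```python
import os
import subprocess
import tempfile

# Stand-in for the data piped into the while loop.
path = os.path.join(tempfile.mkdtemp(), "actionlist")
with open(path, "w") as f:
    f.write("dataset1\ndataset2\ndataset3\n")

# Open unbuffered so each readline() advances the OS-level file offset
# one line at a time, the way the shell's `read` builtin does.
stream = open(path, "rb", buffering=0)

first = stream.readline()  # the loop body handles the first line
# `cat` plays the role of ssh: a child process that inherits the same
# file descriptor and reads it through to end-of-file.
subprocess.run(["cat"], stdin=stream, stdout=subprocess.DEVNULL)
rest = stream.readline()   # b'' -- the loop's remaining input is gone
stream.close()
```

The same logic explains why prepending `echo` "fixes" it: echo never reads stdin, so the loop's input survives to the next iteration.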


Single-cell RNA expression analysis (scRNA-seq) is revolutionizing whole-organism science [1, 2] allowing the unbiased identification of previously uncharacterized molecular heterogeneity at the cellular level. Statistical analysis of single-cell gene expression profiles can highlight putative cellular subtypes, delineating subgroups of T cells [3], lung cells [4] and myoblasts [5]. These subgroups can be clinically relevant: for example, individual brain tumors contain cells from multiple types of brain cancers, and greater tumor heterogeneity is associated with worse prognosis [6].

Despite the success of early single-cell studies, the statistical tools that have been applied to date are largely generic, rarely taking into account the particular structural features of single-cell expression data. In particular, single-cell gene expression data contain an abundance of dropout events that lead to zero expression measurements. These dropout events may be the result of technical sampling effects (due to low transcript numbers) or real biology arising from stochastic transcriptional activity (Fig. 1 a). Previous work has been undertaken to account for dropouts in univariate analysis, such as differential expression analysis, using mixture modeling [7, 8]. However, approaches for multivariate problems, including dimensionality reduction, have not yet been considered. As a consequence, it has not been possible to ascertain fully the ramifications of applying dimensionality-reduction techniques, such as principal components analysis (PCA), to zero-inflated data.

Fig. 1 Zero-inflation in single-cell expression data. a Illustrative distribution of expression levels for three randomly chosen genes showing an abundance of single cells exhibiting null expression [15]. b Heat maps showing the relationship between dropout rate and mean non-zero expression level for three published single-cell data sets [3, 5, 14], including an approximate double exponential model fit. c Flow diagram illustrating the data-generative process used by ZIFA. d Illustrative plot showing how different values of λ in the dropout-mean expression relationship (blue lines) can modulate the latent gene expression distribution to give a range of observed zero-inflated data

Dimensionality reduction is a universal data-processing step in high-dimensional gene expression analysis. It involves projecting data points from the very high-dimensional gene expression measurement space to a low-dimensional latent space reducing the analytical problem from a simultaneous examination of tens of thousands of individual genes to a much smaller number of (weighted) collections that exploit gene co-expression patterns. In the low-dimensional latent space, it is hoped that patterns or connections between data points that are hard or impossible to identify in the high-dimensional space will be easy to visualize.

The most frequently used technique is PCA, which identifies the directions of largest variance (principal components) and uses a linear transformation of the data into a latent space spanned by these principal components. The transformation is linear because the coordinates of the data points in the low-dimensional latent space are a weighted sum of the coordinates in the original high-dimensional space, with no non-linear transformations involved. Other linear techniques include factor analysis (FA), which is similar to PCA but focuses on modeling correlations rather than covariances. Many non-linear dimensionality-reduction techniques are also available, but linear methods are often used as an initial step in any dimensionality-reduction processing, since non-linear techniques are typically more computationally complex and do not scale well to handling many thousands of genes and samples simultaneously.
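The linearity described above is concrete: after centering, PCA is just a matrix product with the top right-singular vectors of the data. A minimal NumPy sketch, with random data standing in for an expression matrix (the 100 cells × 2,000 genes shape is an arbitrary toy size):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2000))   # toy matrix: 100 cells x 2000 genes
Xc = X - X.mean(axis=0)            # center each gene

# Right-singular vectors of the centered matrix are the principal axes.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T                  # coordinates in a 2-D latent space

# Each latent coordinate is a weighted sum of the original gene
# coordinates -- the defining property of a linear method.
```

By construction, the first latent dimension captures at least as much variance as the second, which is what makes the projection useful for visualization.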

In this article, we focus on the impact of dropout events on the output of dimensionality-reduction algorithms (principally linear approaches) and propose a novel extension of the framework of probabilistic principal components analysis (PPCA) [9] or FA to account for these events. We show that the performance of standard dimensionality-reduction algorithms on high-dimensional single-cell expression data can be perturbed by the presence of zero-inflation making them suboptimal. We present a new dimensionality-reduction model, zero-inflated factor analysis (ZIFA), to account explicitly for the presence of dropouts. We demonstrate that ZIFA outperforms other methods on simulated data and single-cell data from recent scRNA-seq studies.

The fundamental empirical observation that underlies the zero-inflation model in ZIFA is that the dropout rate for a gene depends on the expected expression level of that gene in the population. Genes with lower expression magnitude are more likely to be affected by dropout than genes that are expressed with greater magnitude. In particular, if the mean level of non-zero expression (log read count) is given by μ and the dropout rate for that gene by p0, we have found that this dropout relationship can be approximately modeled with a parametric form p0 = exp(−λμ²), where λ is a fitted parameter, based on a double exponential function. This relationship is consistent with previous investigations [7] and holds in many existing single-cell data sets (Fig. 1b), including a data set with unique molecular identifiers [10] (Additional file 1: Figure S1). The use of this parametric form permits fast, tractable linear algebra computations in ZIFA, enabling its use on realistically sized data sets in a multivariate setting.
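The dropout relationship is easy to simulate. In the sketch below, each latent measurement is zeroed with its gene's probability p0 = exp(−λμ²); the λ value, the gene means, and the Gaussian noise scale are arbitrary illustrative choices, whereas ZIFA fits these quantities from data:

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 0.1                          # decay parameter (fitted in ZIFA)
mu = np.array([0.5, 2.0, 6.0])     # mean log non-zero expression per gene

p_dropout = np.exp(-lam * mu ** 2)  # dropout probability per gene

# Zero-inflate a matrix of latent expression values: each entry is
# independently set to zero with its gene's dropout probability.
latent = rng.normal(loc=mu, scale=0.5, size=(10000, 3))
mask = rng.random(size=latent.shape) < p_dropout
observed = np.where(mask, 0.0, latent)
```

The weakly expressed first gene ends up almost entirely zeroed while the strongly expressed third gene is barely affected, reproducing the pattern in Fig. 1b.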

Cellular players get their moment in the limelight

Credit: Whitehead Institute for Biomedical Research

In order to understand our biology, researchers need to investigate not only what cells are doing, but also more specifically what is happening inside of cells at the level of organelles, the specialized structures that perform unique tasks to keep the cell functioning. However, most methods for analysis take place at the level of the whole cell. Because a specific organelle might make up only a fraction of an already microscopic cell's contents, "background noise" from other cellular components can drown out useful information about the organelle being studied, such as changes in the organelle's protein or metabolite levels in response to different conditions.

Whitehead Institute Member David Sabatini and Walter Chen, a former graduate student in Sabatini's lab and now a pediatrics resident at Boston Children's Hospital and Boston Medical Center and a postdoctoral researcher at Harvard Medical School, developed in recent years a method for isolating organelles for analysis that outstrips previous methods in its ability to purify organelles both rapidly and specifically. They first applied the method to mitochondria, the energy-generating organelles known as the "powerhouses of the cell," and published their study in Cell in 2016. Subsequently, former Sabatini lab postdoctoral researcher Monther Abu-Remaileh and graduate student Gregory Wyant applied the method to lysosomes, the recycling plants of cells that break down cell parts for reuse, as described in the journal Science in 2017. In collaboration with former Sabatini lab postdoctoral researcher Kivanc Birsoy, Sabatini and Chen next developed a way to use the mitochondrial method in mice, as described in PNAS in 2019. Now, in a paper published in iScience on May 22, Sabatini, Chen, and graduate student Jordan Ray have extended the method for use on peroxisomes, organelles that play essential roles in human physiology.

"It's gratifying to see this toolkit expand so we can use it to gain insight into the nuances of these organelles' biology," Sabatini says.

Using their organellar immunoprecipitation techniques, the researchers have uncovered previously unknown aspects of mitochondrial biology, including changes in metabolites during diverse states of mitochondrial function. They also uncovered new aspects of lysosomal biology, including how nutrient starvation affects the exchange of amino acids between the organelle and the rest of the cell. Their methods could help researchers gain new insights into diseases in which mitochondria or lysosomes are affected, such as mitochondrial respiratory chain disorders, lysosomal storage diseases, and Parkinson's Disease. Now that Sabatini, Chen, and Ray have extended the method to peroxisomes, it could also be used to learn more about peroxisome-linked disorders.

Developing a potent method

The researchers' method is based on "organellar immunoprecipitation," which utilizes antibodies, immune system proteins that recognize specific perceived threats that they are supposed to bind to and help remove from the body. The researchers create a custom tag for each type of organelle by taking an epitope, the section of a typical perceived threat that antibodies recognize and bind to, and fusing it to a protein that is known to localize to the membrane of the organelle of interest, so the tag will attach to the organelle. The cells containing these tagged organelles are first broken up to release all of the cell's contents, and then put in solution with tiny magnetic beads covered in the aforementioned antibodies. The antibodies on the beads latch onto the tagged organelles. A magnet is then used to collect all of the beads and separate the bound organelles from the rest of the cellular material, while contaminants are washed away. The resulting isolated organelles can subsequently be analyzed using a variety of methods that look at the organelles' metabolites, lipids, and proteins.

With their method, Chen and Sabatini have developed an organellar isolation technique that is both rapid and specific, qualities that prior methods have typically lacked. The workflow that Chen and Sabatini developed is fast—this new iteration for peroxisomes takes only 10 minutes to isolate the tagged organelles once they have been released from cells. Speed is important because the natural profile of the organelles' metabolites and proteins begins to change once they are released from the cell, and the longer the process takes, the less the results will reflect the organelle's native state.

"We're interested in studying the metabolic contents of organelles, which can be labile over the course of an isolation," Chen says. "Because of their speed and specificity, these methods allow us to not only better assess the original metabolic profile of a specific organelle but also study proteins that may have more transient interactions with the organelle, which is very exciting."

Peroxisomes take the limelight

Peroxisomes are organelles that are important for multiple metabolic processes and contribute to a number of essential biological functions, such as producing the insulating myelin sheaths for neurons. Defects in peroxisomal function are found in various genetic disorders in children and have been implicated in neurodegenerative diseases as well. However, compared to other organelles such as mitochondria, peroxisomes are relatively understudied. Being able to get a close-up look at the contents of peroxisomes may provide insights into important and previously unappreciated biology. Importantly, in contrast to traditional ways of isolating peroxisomes, the new method that Sabatini, Chen, and Ray have developed is not only fast and specific, but also reproducible and easy to use.

"Peroxisomal biology is quite fascinating, and there are a lot of outstanding questions about how they are formed, how they mature, and what their role is in disease that hopefully this tool can help elucidate," Ray says.

An exciting next step may be to adapt the peroxisome isolation method so it can be used in a mammalian model organism, such as mice, something the researchers have already done with the mitochondrial version.

"Using this method in animals could be especially helpful for studying peroxisomes because peroxisomes participate in a variety of functions that are essential on an organismal rather than cellular level," Chen says. Going forward, Chen is interested in using the method to profile the contents of peroxisomes in specific cell types across a panel of different mammalian organs.

While Chen sets out to discover what unknown biology the peroxisome isolation method can reveal, researchers in Sabatini's lab are busy working on another project: extending the method to even more organelles.

Erol C. Bayraktar et al. MITO-Tag Mice enable rapid isolation and multimodal profiling of mitochondria from specific cell types in vivo, Proceedings of the National Academy of Sciences (2018). DOI: 10.1073/pnas.1816656115

What is Ksp?

#K_(sp)# is called the solubility product constant, or simply the solubility product. In general, the solubility product of a compound is the product of the molar concentrations of its ions, each raised to the power of its stoichiometric coefficient in the equilibrium reaction.

Here's an example to better demonstrate the concept. Let's consider the saturated solution of silver chloride ( #AgCl# ), where an equilibrium exists between the dissolved ions and undissolved silver chloride according to the following reaction:

#AgCl_((s)) rightleftharpoons Ag_((aq))^(+) + Cl_((aq))^(-)#

Since this is an equilibrium reaction, we can write the equilibrium constant for it:

#K = ([Ag^(+)]*[Cl^(-)])/([AgCl])# . Now, the concentration of a pure solid is constant (its activity is taken to be 1), so it is absorbed into the equilibrium constant and the expression becomes

#K_(sp) = [Ag^(+)]*[Cl^(-)]#

The magnitude of #K_(sp)# indicates the solubility of the salt in water, since #K_(sp)# is derived from the concentrations of ions at equilibrium. Thus, for salts that dissociate into the same number of ions, a higher #K_(sp)# means higher ion concentrations at saturation and therefore greater solubility.

When trying to write the equation for #K_(sp)# , you need to know how to break the compound into ions (identify the monoatomic and polyatomic ions), how many moles of each ion are formed, and the charge on each ion.
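To make the stoichiometric powers concrete, here is a small Python sketch that solves the #K_(sp)# expression for molar solubility. It assumes ideal behaviour (activities replaced by molar concentrations), and the #K_(sp)# values are approximate textbook figures at 25 °C:

```python
# Molar solubility s (mol/L) from Ksp for a salt MaXb that dissolves as
#   MaXb(s) <=> a M^(b+)(aq) + b X^(a-)(aq)
# so that Ksp = [M]^a * [X]^b = (a*s)**a * (b*s)**b = a**a * b**b * s**(a+b).

def molar_solubility(ksp, a=1, b=1):
    """Solve Ksp = a**a * b**b * s**(a+b) for s."""
    return (ksp / (a ** a * b ** b)) ** (1.0 / (a + b))

# AgCl (1:1 salt): Ksp ~ 1.8e-10, so Ksp = s*s and s = sqrt(Ksp) ~ 1.3e-5 mol/L
s_agcl = molar_solubility(1.8e-10)

# PbCl2 (1:2 salt): Ksp ~ 1.7e-5, so Ksp = s*(2s)**2 = 4*s**3 and s ~ 1.6e-2 mol/L
s_pbcl2 = molar_solubility(1.7e-5, a=1, b=2)

print(f"AgCl:  s = {s_agcl:.2e} mol/L")
print(f"PbCl2: s = {s_pbcl2:.2e} mol/L")
```

Note that the two salts illustrate why raw #K_(sp)# values can only be compared directly between salts with the same dissociation stoichiometry: the exponent on s differs with the ion count.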

Using ask and with

The previous examples of ask have been applied to all the turtles or patches in the model. However, most of the time it is more useful to execute a command on a smaller group. To do this, we can use the with command.

Here, instead of giving all the turtles to the ask command, we give it only the group of turtles that have a value of true stored in their variable called sick (this is a special variable created specifically for the 'virus' model).
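As a sketch in NetLogo syntax (assuming, as the text describes, that the virus model stores infection status in a turtle variable called sick):

```netlogo
;; ask every turtle (the pattern used in the earlier examples)
ask turtles [ set color green ]

;; ask only the turtles whose sick variable is true
ask turtles with [ sick ] [ set color red ]
```

The with part filters the turtles agentset down to a smaller agentset before ask hands the command block to each member.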

Visually, these two commands are constructed like so:

Finally, we will use a combination of ask and some other commands to do something more interesting than changing colours.

You can probably see that the first command ( ask turtles [ facexy 0 0 ] ) tells each turtle to spin round and face the coordinate (0,0) (which happens to be in the middle of the world in this model). The commands that follow ( ask turtles [ forward 1 ] ) tell the agents to move forward one step in the direction that they are facing. This might seem trivial, but you now have covered the main commands that you need to create an agent-based model!
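Put together as a procedure, the two commands from this example would look like the following sketch (the procedure name is arbitrary):

```netlogo
to move-to-centre
  ask turtles [ facexy 0 0 ]  ;; each turtle turns to face the coordinate (0,0)
  ask turtles [ forward 1 ]   ;; each turtle takes one step in that direction
end
```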

Remember, information about all the different commands that are available can be found in the NetLogo documentation. In particular, the NetLogo Dictionary lists every command that is available.

The merging of humans and machines is happening now

The merging of machine capability and human consciousness is already happening. Writing exclusively for WIRED, DARPA director Arati Prabhakar outlines the potential rewards - and the risks - that lie ahead

Peter Sorger and Ben Gyori are brainstorming with a computer in a laboratory at Harvard Medical School. Their goal is to figure out why a powerful melanoma drug stops helping patients after a few months. But if their approach to human-computer collaboration is successful, it could generate a new approach to fundamentally understanding complexities that may change not only how cancer patients are treated, but also how innovation and discovery are pursued in countless other domains.

At the heart of their challenge is the crazily complicated hairball of activity going on inside a cancer cell - or in any cell. Untold thousands of interacting biochemical processes, constantly morphing, depending on which genes are most active and what's going on around them. Sorger and Gyori know from studies of cells taken from treated patients that the melanoma drug's loss of efficacy over time correlates with increased activity of two genes. But with so many factors directly or indirectly affecting those genes, and only a relatively crude model of those global interactions available, it's impossible to determine which actors in the cell they might want to target with additional drugs.

That's where the team's novel computer system comes in. All Sorger and Gyori have to do is type in a new idea they have about the interactions among three proteins, based on a mix of clinical evidence, their deep scientific expertise, and good old human intuition. The system instantly considers the team's thinking and generates hundreds of new differential equations, enriching and improving its previous analytical model of the myriad activities inside drug-treated cells. And then it spits out new results.

These don't predict all the relevant observations from tumour cells, but they give the researchers another idea involving two more proteins - which they shoot back on their keyboard. The computer churns and responds with a new round of analysis, producing a model that, it turns out, predicts exactly what happens in patients and offers new clues about how to prevent some cases of melanoma recurrence.

In a sense, Sorger and Gyori do what scientists have done for centuries with one another: engage in ideation and a series of what-ifs. But in this case, their intellectual partner is a machine that builds, stores, computes and iterates on all those hundreds of equations and connections.

The combination of insights from the researchers and their computer creates a model that does not simply document correlations - "When you see more of this, you'll likely see more of that" - but rather starts to unveil the all-important middle steps and linkages of cause and effect, the how and why of molecular interactions, instead of just the what. In doing so, they make a jump from big data to deep understanding.

More than 3,220km away, another kind of human-machine collaboration unfolds at the University of Utah as Greg Clark asks Doug Fleenor to reach out and touch the image of a wooden door on a computer monitor.

Clark knows that Fleenor cannot physically touch this or any other object: Fleenor lost both his hands in a near-fatal electrical accident 25 years ago. But Fleenor's arm has a chip in it that communicates with the computer, so when he moves his arm the image of a hand on the monitor also moves. He's done this before - raising his arm, watching the cartoon hand move in sync and seemingly stroke the face of the door - but this time it's different. He lurches back and gasps. "That is so cool!" he blurts.

What's so cool is that as he guides his virtual hand across that virtual plank, he literally, biologically and neurologically, feels its wooden surface. Thanks to some new software and an array of fine electrical connections between another embedded chip and the nerves running up his arm to his brain, he experiences a synthesised sensation of touch and texture indistinguishable from a tactile event.

For someone who hasn't actually touched anything with his hands for a quarter of a century, this is a transcendent moment - one that points to a remarkable future that is now becoming real… and in Fleenor's case, even tangible.

In ways as diverse as a shared understanding of causal complexity as in Peter Sorger's lab and the seamless commingling of software and wetware as in Greg Clark's lab, it's a future in which humans and machines will not just work side by side, but rather will interact and collaborate with such a degree of intimacy that the distinction between us and them will become almost imperceptible.

"We and our technological creations are poised to embark on what is sure to be a strange and deeply commingled evolutionary path" Arati Prabhakar, DARPA's former director

Building on adaptive signal processing and sensitive neural interfaces, machine reasoning and complex systems modelling, a new generation of capabilities is starting to integrate the immense power of digital systems and the mysterious hallmark of Homo sapiens - our capacity to experience insights and apply intuition. After decades of growing familiarity, we and our technological creations are poised to embark on what is sure to be a strange, exciting and deeply commingled evolutionary path.

Are we ready? Some signals suggest not. Even setting aside hyperbolic memes about our pending subservience to robot overlords, many are concerned about the impact of artificial intelligence and robotics on employment and the economy. A US survey last year by the Pew Research Center found that people are generally "more worried than enthusiastic" about breakthroughs that promise to integrate biology and technology, such as brain chip implants and engineered blood.

My particular vantage point on the future comes from leading the Defense Advanced Research Projects Agency (DARPA), the US government agency whose mission is to create breakthrough technologies for national security. Over six decades, we've sparked technological revolutions that ultimately led to some of today's most advanced materials and chip technologies, wave after wave of artificial intelligence, and the internet.

Today, Clark's work and Sorger's are part of the couple of hundred DARPA programmes opening the next technological frontier. And from my perspective, which embraces a wide swathe of research disciplines, it seems clear that we humans are on a path to a more symbiotic union with our machines.

What's drawing us forward is the lure of solutions to previously intractable problems, the prospect of advantageous enhancements to our inborn abilities, and the promise of improvements to the human condition. But as we stride into a future that will give our machines unprecedented roles in virtually every aspect of our lives, we humans - alone or even with the help of those machines - will need to wrangle some tough questions about the meaning of personal agency and autonomy, privacy and identity, authenticity and responsibility. Questions about who we are and what we want to be.

The internet: Inspired by the vision of computer scientist JCR Licklider, DARPA in 1969 demonstrated the first computer-to-computer communication system: a four-node network. It was the first in a long series of advances that led to today's global internet

Technology has long served as a window into our tangled inner nature. With every advance - from the earliest bone tools and stone hammers to today's jet engines and social media - our technologies have revealed and amplified our most creative and destructive sides.

For a long time, while technology was characterised primarily as "tools to help us do", the fear was that machines would turn us into machines, like the blue-collar automatons in Charlie Chaplin's Modern Times. More recently have come "tools to help us think", and with them the opposite fear: that machines might soon grow smarter than us - or at least behave as though they are our boss.

Neither of these two fears has proven completely unfounded: witness, respectively, the daily hordes of zombie-like commuters staring at their phones, and today's debates about how and when to grant autonomy to driverless cars or military systems. But, although we're still grappling with these ideas, today, a third wave of technological innovation is starting, featuring machines that don't just help us do or think. They have the potential to help us be.

For some, this new symbiosis will feel like romance, and for others it will be a shotgun wedding. But either way it's worth understanding: how did we get here?

"A third wave of technological innovation is starting, featuring machines that don't just help us do or think - they have the potential to help us be" Arati Prabhakar, DARPA's former director

As with many revolutions, the roots of this emerging symbiosis run deep. All the way back in 1960, the visionary psychologist and computer pioneer JCR Licklider wrote with remarkable prescience of his hope "that, in not too many years, human brains and computing machines will be coupled together very tightly, and that the resulting partnership will think as no human brain has ever thought and process data in a way not approached by the information-handling machines we know today."

Licklider helped to launch the information revolution of the past half century, but the full realisation of this particular dream had to wait a few decades for two technology trends to mature.

The first of these trends is a direct outgrowth of that information revolution: today's big-bang-like expansion of capabilities in data science and artificial intelligence is coming into confluence with an unprecedented ability to incorporate in these systems human insights, expertise, context and common sense.

We humans have been very good, it turns out, at creating hugely complex systems - consider the multibillion-node internet, chips with billions of transistors, aircraft that have millions of individual components - and at collecting data about complex naturally occurring systems, from microbial interactions to climate dynamics to global patterns of societal behaviour.

But it's proven much more difficult to grasp how or why these super systems do what they do or what hidden nuggets of wisdom these datasets may contain, much less how to access that embedded knowledge to fuel further progress.

Here are some complex things we don't fully understand today: what is it about the combination of individually sensible algorithms that sometimes causes a flash crash on a stock exchange? What factors lead people in different parts of the world to develop a shared identity or sense of community, and what influences are most likely to break those bonds and fuel chaos, mass migration or revolution?

Of the countless factors that contribute to or protect against certain diseases, which are primary, and how do they interact? And for each of these puzzles, where are the most potent nodes or pressure points for which a modest fix might offer the biggest benefit? Today, as we humans start to work and think with our machines to transcend simple correlation - the powerful but ultimately limited basis of the initial wave of big-data applications - and perceive the deep linkages of causation, the answers to complex questions like these are coming within reach.

Autonomous vehicles: The 2004 DARPA Grand Challenge invited innovators to develop cars that could complete a course with no human on board. It stumped all entrants, but in 2005 a Stanford team won the prize and helped launch the revolution in self-driving cars

DARPA's Big Mechanism programme, of which Sorger is part, is one such effort, and it's not just about refining the picture of how drugs and genes work on melanoma cells. In another part of that programme, researchers have created machines that use advanced language processing algorithms to read scientific journal articles about particular cancer genes overnight, and then, each day, submit everything they've learned into a giant, continuously evolving model of cancer genetics.

These machines can read tens of thousands of scientific-journal articles per week - orders of magnitude more than a team of scientists could ingest - and can perform deep semantic analysis as they read, to reveal not just snapshots of cellular activities, but causal chains of biochemical reactions that enable the system to build quantitative models. In collaboration with human experts studying those results, the programme has begun generating promising new hypotheses about how to attack certain cancers with novel combinations of already approved medicines.

Along similar lines, the Bill & Melinda Gates Foundation has used DARPA-developed analytic tools to build a picture of the scores of factors related to child stunting, malnutrition and obesity. An effort of this scale would ordinarily take months of literature review, but it took just a few days to sketch out.

The resulting web of influences includes such disparate factors as breastfeeding, urbanisation and government subsidies for processed foods. It's a web to which humans can bring their own expertise - such as insights into the economic and political realities that might make it more practical to focus on one of those influences rather than another - so they can generate with their inanimate partners new public health strategies that scale from biochemistry to policy.

Cancer-research predicaments and the problems of childhood stunting and obesity are "Goldilocks" challenges - extremely difficult but plausibly tractable - in terms of the number of contributing factors and degrees of complexity that must be sorted through by humans and their machines to eke out solutions. What we learn from these efforts will have application in a range of national-security quandaries. Imagine, for example, developing practical insights by analytically modelling questions such as "What will be the impacts to critical infrastructure if a region's drought lasts for another five years?" and "To what degree might a country develop its human capital and economically empower women, and what impact would this have on its future political and economic trajectory?"

More broadly, it's difficult to find a programme at DARPA today that doesn't at some level aim to take advantage of a melding of human and computer traits and skill sets, from the design of novel synthetic chemicals, to the architecting of elaborate structures made possible by the advent of 3D additive manufacturing methods, to the command and control of unmanned aerial systems and management of the spectrum in highly congested communications environments.

Microsystems: From the micro-electromechanical chips that tell your phone when it has moved to the gallium arsenide circuits that transmit radio signals to cell towers, DARPA-initiated technologies have enabled the hand-held devices we so depend on today

To understand the second fast-moving trend that's supporting the emerging human-machine symbiosis, we need to move from hardware and software to wetware, and the progress being made in the field of neurotechnology.
