Help with STRING database data

I'm working with data downloaded from the STRING database (string-db.org) for protein-protein interactions. My idea is to compare the topology of connections of the same protein across different organisms.

However, I noticed that the same protein can receive a different ID in each organism.

So I would like to know if there is any way to convert all the IDs to a single, consistent identifier scheme.

Thanks.


Proteins evolve and have different sequences between species, so you would first have to define what you mean by "the same protein". One option would be to use an orthology database like eggNOG (eggNOG uses the same protein identifiers as STRING). Then you could work out 1:1 correspondences between proteins.

You probably also want to read up on Roded Sharan's work, e.g. Global alignment of protein-protein interaction networks.


If I understand it correctly, you have downloaded, for example, 1000 protein sequences with 1000 IDs, but there are duplicate sequences, so in reality it is like having 600 unique sequences with 1000 IDs? If so, it should be fairly easy to write a script that creates a set of unique sequences with all corresponding IDs, so you could choose which one to use.

In Python it could be done using the sequence as the dictionary key and the ID as a value. While looping over the sequences, check whether each sequence is already in the dictionary; if it is, append the new ID to its list of values. Finally you would get

seqs = { 'DFABIODFAFDIOAF… ':['ID001', 'ID007'], 'ANOTHERUNIQUESEQUENCE':['ID50'],… }

from which it should be easy to choose one representative ID per sequence.
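A minimal sketch of that idea, assuming the sequences arrive as a FASTA file (the file name proteins.fa and the function name are illustrative, not from the original post):

    from collections import defaultdict

    def group_ids_by_sequence(fasta_path):
        """Map each unique sequence to the list of IDs that share it."""
        seqs = defaultdict(list)
        seq_id, chunks = None, []
        with open(fasta_path) as fh:
            for line in fh:
                line = line.strip()
                if line.startswith('>'):
                    if seq_id is not None:          # store the previous record
                        seqs[''.join(chunks)].append(seq_id)
                    seq_id, chunks = line[1:].split()[0], []
                else:
                    chunks.append(line)
        if seq_id is not None:                      # flush the final record
            seqs[''.join(chunks)].append(seq_id)
        return dict(seqs)

    # seqs = group_ids_by_sequence('proteins.fa')
    # -> {'DFABIODFAFDIOAF… ': ['ID001', 'ID007'], 'ANOTHERUNIQUESEQUENCE': ['ID50'], … }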

TBH I'm not sure about the efficiency of this, but that depends on the size of the dataset. How large is it? Just give me a sample dataset and I can write it.




References

When using STRING, please consult (and cite) the following references:
Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, Simonovic M, Doncheva NT, Morris JH, Bork P, Jensen LJ, von Mering C.
STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets.
Nucleic Acids Res. 2019 Jan;47:D607-D613. PubMed

Szklarczyk D, Morris JH, Cook H, Kuhn M, Wyder S, Simonovic M, Santos A, Doncheva NT, Roth A, Bork P, Jensen LJ, von Mering C.
The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible.
Nucleic Acids Res. 2017 Jan;45:D362-D368. PubMed

Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, Kuhn M, Bork P, Jensen LJ, von Mering C.
STRING v10: protein-protein interaction networks, integrated over the tree of life.
Nucleic Acids Res. 2015 Jan;43:D447-D452. PubMed

Franceschini A, Lin J, von Mering C, Jensen LJ.
SVD-phy: improved prediction of protein functional associations through singular value decomposition of phylogenetic profiles.
Bioinformatics. 2015 Nov;btv696. PubMed

Franceschini A, Szklarczyk D, Frankild S, Kuhn M, Simonovic M, Roth A, Lin J, Minguez P, Bork P, von Mering C, Jensen LJ.
STRING v9.1: protein-protein interaction networks, with increased coverage and integration.
Nucleic Acids Res. 2013 Jan;41:D808-D815. PubMed

Szklarczyk D, Franceschini A, Kuhn M, Simonovic M, Roth A, Minguez P, Doerks T, Stark M, Muller J, Bork P, Jensen LJ, von Mering C.
The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored.
Nucleic Acids Res. 2011 Jan;39:D561-D568. PubMed

Jensen LJ, Kuhn M, Stark M, Chaffron S, Creevey C, Muller J, Doerks T, Julien P, Roth A, Simonovic M, Bork P, von Mering C.
STRING 8 - a global view on proteins and their functional interactions in 630 organisms.
Nucleic Acids Res. 2009 Jan;37:D412-D416. PubMed

von Mering C, Jensen LJ, Kuhn M, Chaffron S, Doerks T, Krueger B, Snel B, Bork P.
STRING 7 - recent developments in the integration and prediction of protein interactions.
Nucleic Acids Res. 2007 Jan;35:D358-D362. PubMed

von Mering C, Jensen LJ, Snel B, Hooper SD, Krupp M, Foglierini M, Jouffre N, Huynen MA, Bork P.
STRING: known and predicted protein-protein associations, integrated and transferred across organisms.
Nucleic Acids Res. 2005 Jan;33:D433-D437. PubMed

von Mering C, Huynen M, Jaeggi D, Schmidt S, Bork P, Snel B.
STRING: a database of predicted functional associations between proteins.
Nucleic Acids Res. 2003 Jan;31:258-261. PubMed

Snel B, Lehmann G, Bork P, Huynen MA.
STRING: a web-server to retrieve and display the repeatedly occurring neighbourhood of a gene.
Nucleic Acids Res. 2000 Sep 15;28(18):3442-3444. PubMed


SQL Server Data Types

String Data Types

| Data type | Description | Max size | Storage |
| --- | --- | --- | --- |
| char(n) | Fixed-width character string | 8,000 characters | Defined width |
| varchar(n) | Variable-width character string | 8,000 characters | 2 bytes + number of chars |
| varchar(max) | Variable-width character string | 1,073,741,824 characters | 2 bytes + number of chars |
| text | Variable-width character string | 2 GB of text data | 4 bytes + number of chars |
| nchar | Fixed-width Unicode string | 4,000 characters | Defined width x 2 |
| nvarchar | Variable-width Unicode string | 4,000 characters | |
| nvarchar(max) | Variable-width Unicode string | 536,870,912 characters | |
| ntext | Variable-width Unicode string | 2 GB of text data | |
| binary(n) | Fixed-width binary string | 8,000 bytes | |
| varbinary | Variable-width binary string | 8,000 bytes | |
| varbinary(max) | Variable-width binary string | 2 GB | |
| image | Variable-width binary string | 2 GB | |

Numeric Data Types

| Data type | Description | Storage |
| --- | --- | --- |
| bit | Integer that can be 0, 1, or NULL | |
| tinyint | Whole numbers from 0 to 255 | 1 byte |
| smallint | Whole numbers from -32,768 to 32,767 | 2 bytes |
| int | Whole numbers from -2,147,483,648 to 2,147,483,647 | 4 bytes |
| bigint | Whole numbers from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | 8 bytes |
| decimal(p,s) | Fixed-precision and fixed-scale numbers, from -10^38+1 to 10^38-1. The p parameter is the maximum total number of digits stored (on both sides of the decimal point); p must be from 1 to 38, and defaults to 18. The s parameter is the maximum number of digits stored to the right of the decimal point; s must be from 0 to p, and defaults to 0. | |
| float(n) | Floating-point number. The n parameter indicates whether the field holds 4 or 8 bytes: float(24) holds a 4-byte field and float(53) holds an 8-byte field. The default value of n is 53. | |
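As a quick illustration, here is a hedged sketch of how several of these types might be declared in a table definition (the table and column names are hypothetical):

    -- Hypothetical table showing several of the types above in use
    CREATE TABLE Product (
        ProductID  int            NOT NULL,  -- whole numbers, 4 bytes
        Code       char(8)        NOT NULL,  -- fixed width: always 8 characters
        Name       nvarchar(100)  NULL,      -- variable-width Unicode, up to 100 chars
        Price      decimal(10,2)  NULL,      -- up to 10 digits, 2 to the right of the point
        Weight     float(24)      NULL,      -- 4-byte floating point
        InStock    bit            NULL       -- 0, 1, or NULL
    );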


Comparison of DAVID with related programs

Several other programs have overlapping and related functionality when compared with DAVID, but none combines all of DAVID's features within a single platform. These programs include ENSMART [16], FatiGO [17], GeneLynx [18], GoMiner [19], MAPPFinder [20], MatchMiner [21], Resourcerer [22] and Source [23], which collectively fall into two general categories: exploratory tools, defined as combining functional annotation with some form of graphical representation of summarized data, and annotation tools, defined as providing query-based access to functional annotation and producing a tabular output. FatiGO, GoMiner, and MAPPFinder are exploratory tools, whereas ENSMART, GeneLynx, MatchMiner, Resourcerer, and Source are strictly annotation tools that produce tabular output. A major advantage of DAVID is that it combines features of both categories, with GoCharts, KeggCharts, and DomainCharts representing exploratory tools, while DAVID's Annotation Tool produces a tabular output of functional annotation. We compared DAVID and these related programs on the basis of their available implementations and documentation as of May 2003, and the distribution of DAVID's functional features among these programs is shown in Table 3.

Exploratory tools

FatiGO is a web-accessible application that functions in much the same way as DAVID's GoCharts, including the ability to specify term-specificity level. Unlike DAVID, FatiGO does not allow the setting of a minimum hit threshold for simplified viewing of only the most highly represented functional categories. Likewise, FatiGO limits the graphical output to only one top-level GO category at a time, whereas DAVID allows the combined viewing of biological process, molecular function, and cellular component annotations simultaneously. FatiGO's static barchart output looks very similar to DAVID's GoChart; an important distinction is that DAVID's GoCharts are dynamic, allowing users to drill down and traverse the GO hierarchy for any subset of genes, view the underlying chart data and associated annotations, and link out to external data repositories including LocusLink and QuickGO. As shown in Table 3, the majority of accession types accepted and functional annotations offered by DAVID are not available from FatiGO.

GoMiner is a standalone Java application that requires downloading of the program itself along with at least two auxiliary files, one for DAG visualization and another for protein structural visualization. The remote database queried by GoMiner is reported to be updated every six months. It has been our experience that, to accurately reflect the current knowledge associated with a given gene, functional annotation data must be updated far more frequently. If users wish to use GoMiner with a local copy of its annotation database, they must also download and install a local copy of the MySQL database and the required drivers, a process that may be difficult for inexperienced users of MySQL. In contrast, DAVID is web-accessible and updated weekly. The functionality of GoMiner is most similar to DAVID's GoCharts module. An enhanced feature of GoMiner is that it provides intuitive tree and DAG views of genes embedded within the GO hierarchy. DAVID has the ability to display such views through hyperlinks of GO terms to QuickGO's tree and DAG views. A unique function provided by DAVID is the ability to drill-down and traverse the GO hierarchy for any subset of genes sharing a common classification, as demonstrated by the identification of stress response genes with cytokine activity. Neither the tree nor DAG view of GoMiner provides this functionality.

The body of biological knowledge associated with any list of genes extends far beyond the structured vocabulary of GO. DAVID provides, in addition to GoCharts, two additional analysis modules that utilize PFAM protein domain designations and KEGG biochemical pathways to graphically summarize the distribution of genes among functional domains and pathways. Moreover, DAVID highlights pathway members within the biochemical pathways provided by KEGG. Whereas GoMiner provides hyperlinks to pathway databases such as BioCarta and KEGG for individual genes, lists of genes can only be batch processed in the context of GO. In addition to providing hyperlinks to external data repositories for each gene, DAVID provides links to primary sequence information available at NCBI and human-curated functional summaries parsed from LocusLink. These features are not available in GoMiner. DAVID can be used to collect, analyze and explore functional annotation associated with human, mouse, rat, and Drosophila gene lists, whereas GoMiner is restricted to analyzing human data. Another restrictive feature of GoMiner is that it only takes HUGO gene symbols as input. This is problematic in that many genes and expressed sequence tags (ESTs) do not have HUGO symbols. Moreover, this restriction requires the translation of every gene list into HUGO symbols.

Like GoMiner, MAPPFinder is a stand-alone, exploratory tool for the analysis of lists of genes within the context of GO. The downloadable program comes with a copy of the supporting relational database of gene to GO-term associations. However, as with GoMiner there are important considerations regarding the installation, support, and updating of the software and underlying database, as indicated by the documentation and bug reports listed on their website. Importantly, in addition to the batch processing of gene lists within the context of GO, MAPPFinder provides functionality similar to that of DAVID's KeggCharts, providing the ability to view lists of genes within the context of biochemical pathways. However, in order to use this functionality through MAPPFinder, users must download additional programs and files, including the GenMAPP program and its associated MAPP files, whereas the KeggCharts module of DAVID is easily accessible at the click of a button.

Annotation tools

ENSMART is a web-accessible application that integrates an enormous amount of functional annotation for numerous species. ENSMART takes as input lists of several accession types, including Affymetrix probe sets, making it quite flexible. Database cross-references provided by ENSMART cover a broad spectrum of functional annotations pertaining to gene- and protein-specific attributes as well as disease and cross-species attributes. However, users are limited to a maximum of three cross-references for a given gene list. Unlike DAVID, ENSMART does not provide graphic summaries of GO categories, protein domains, or biochemical pathway membership, nor does ENSMART provide the ability to drill-down within groups of genes sharing common functional features.

GeneLynx and Source are highly similar web-accessible annotation tools that provide a wealth of gene-specific information for individual genes, and both are flexible in that they take as input several different accession types. However, the rich information and available hyperlinks provided in single-gene mode are lost when either GeneLynx or Source is used to batch process lists of genes. The output of batch processing with Source is a text-style table that is feasible for download and automated processing, but provides little utility for interactive exploration. Although GeneLynx can perform batch searching for a list of genes, functional annotations must be viewed one gene at a time.

MatchMiner is a companion program of GoMiner that performs the translations of gene accession types into the HUGO symbols required by GoMiner. MatchMiner is simply a web-accessible resource for translating accession types. It takes several accession types but does not take LocusLink numbers, and although it was reported to accept identifiers from Affymetrix chip sets, MatchMiner returned no data for several gene lists composed of HuFL6800 probe sets. Notably, MatchMiner does not provide any functional annotation and is restricted to human data. Thus, within the context of the other exploratory and annotation tools discussed here, MatchMiner's utility is limited, or supportive, at best.

Resourcerer is a web-accessible application for comparing and annotating human, mouse, and rat GeneChip and microarray platforms. A major feature of Resourcerer is its broad coverage of microarray platforms and its ability to identify overlapping gene targets between chips, even across technology platforms and species barriers. Resourcerer's output is in tabular form and provides hyperlinks to accession cross-references such as GenBank and UniGene. Resourcerer does not provide graphic summaries or annotations from GO, PFAM, KEGG, or any other resource, thus limiting its utility as a tool for functional annotation.


Parameter Short Form Value Description
/Action: /a Publish Specifies the action to be performed.
/AccessToken: /at Specifies the token-based authentication access token to use when connecting to the target database.
/AzureKeyVaultAuthMethod: /akv Specifies what authentication method is used for accessing Azure KeyVault if a publish operation includes modifications to an encrypted table/column.
/ClientId: /cid Specifies the Client ID to be used in authenticating against Azure KeyVault, when necessary
/DeployScriptPath: /dsp Specifies an optional file path to output the deployment script. For Azure deployments, if there are TSQL commands to create or modify the master database, a script will be written to the same path but with "Filename_Master.sql" as the output file name.
/DeployReportPath: /drp Specifies an optional file path to output the deployment report xml file.
/Diagnostics: /d Specifies whether diagnostic logging is output to the console. Defaults to False.
/DiagnosticsFile: /df Specifies a file to store diagnostic logs.
/MaxParallelism: /mp Specifies the degree of parallelism for concurrent operations running against a database. The default value is 8.
/OverwriteFiles: /of Specifies if sqlpackage.exe should overwrite existing files. Specifying false causes sqlpackage.exe to abort action if an existing file is encountered. Default value is True.
/Profile: /pr Specifies the file path to a DAC Publish Profile. The profile defines a collection of properties and variables to use when generating outputs.
/Properties: /p Specifies a name-value pair for an action-specific property, in the form PropertyName=Value. Refer to the help for a specific action to see that action's property names. Example: sqlpackage.exe /Action:Publish /?.
/Quiet: /q Specifies whether detailed feedback is suppressed. Defaults to False.
/Secret: /secr Specifies the Client Secret to be used in authenticating against Azure KeyVault, when necessary
/SourceConnectionString: /scs Specifies a valid SQL Server/Azure connection string to the source database. If this parameter is specified, it shall be used exclusively of all other source parameters.
/SourceDatabaseName: /sdn Defines the name of the source database.
/SourceEncryptConnection: /sec Specifies if SQL encryption should be used for the source database connection.
/SourceFile: /sf Specifies a source file to be used as the source of action instead of a database. If this parameter is used, no other source parameter shall be valid.
/SourcePassword: /sp For SQL Server Auth scenarios, defines the password to use to access the source database.
/SourceServerName: /ssn Defines the name of the server hosting the source database.
/SourceTimeout: /st Specifies the timeout for establishing a connection to the source database in seconds.
/SourceTrustServerCertificate: /stsc Specifies whether to use TLS to encrypt the source database connection and bypass walking the certificate chain to validate trust.
/SourceUser: /su For SQL Server Auth scenarios, defines the SQL Server user to use to access the source database.
/TargetConnectionString: /tcs Specifies a valid SQL Server/Azure connection string to the target database. If this parameter is specified, it shall be used exclusively of all other target parameters.
/TargetDatabaseName: /tdn Specifies an override for the name of the database that is the target of sqlpackage.exe Action.
/TargetEncryptConnection: /tec Specifies if SQL encryption should be used for the target database connection.
/TargetPassword: /tp For SQL Server Auth scenarios, defines the password to use to access the target database.
/TargetServerName: /tsn Defines the name of the server hosting the target database.
/TargetTimeout: /tt Specifies the timeout for establishing a connection to the target database in seconds. For Azure AD, it is recommended that this value be greater than or equal to 30 seconds.
/TargetTrustServerCertificate: /ttsc Specifies whether to use TLS to encrypt the target database connection and bypass walking the certificate chain to validate trust.
/TargetUser: /tu For SQL Server Auth scenarios, defines the SQL Server user to use to access the target database.
/TenantId: /tid Represents the Azure AD tenant ID or domain name. This option is required to support guest or imported Azure AD users as well as Microsoft accounts such as outlook.com, hotmail.com, or live.com. If this parameter is omitted, the default tenant ID for Azure AD will be used, assuming that the authenticated user is a native user for this AD. However, in this case any guest or imported users and/or Microsoft accounts hosted in this Azure AD are not supported and the operation will fail.
For more information about Active Directory Universal Authentication, see Universal Authentication with SQL Database and Azure Synapse Analytics (SSMS support for MFA).
/UniversalAuthentication: /ua Specifies if Universal Authentication should be used. When set to True, the interactive authentication protocol is activated supporting MFA. This option can also be used for Azure AD authentication without MFA, using an interactive protocol requiring the user to enter their username and password or integrated authentication (Windows credentials). When /UniversalAuthentication is set to True, no Azure AD authentication can be specified in SourceConnectionString (/scs). When /UniversalAuthentication is set to False, Azure AD authentication must be specified in SourceConnectionString (/scs).
For more information about Active Directory Universal Authentication, see Universal Authentication with SQL Database and Azure Synapse Analytics (SSMS support for MFA).
/Variables: /v Specifies a name-value pair for an action-specific variable, in the form VariableName=Value. The DACPAC file contains the list of valid SQLCMD variables. An error results if a value is not provided for every variable.
Property Value Description
/p: AdditionalDeploymentContributorArguments=(STRING) Specifies additional deployment contributor arguments for the deployment contributors. This should be a semi-colon delimited list of values.
/p: AdditionalDeploymentContributors=(STRING) Specifies additional deployment contributors, which should run when the dacpac is deployed. This should be a semi-colon delimited list of fully qualified build contributor names or IDs.
/p: AdditionalDeploymentContributorPaths=(STRING) Specifies paths to load additional deployment contributors. This should be a semi-colon delimited list of values.
/p: AllowDropBlockingAssemblies=(BOOLEAN) This property is used by SqlClr deployment to cause any blocking assemblies to be dropped as part of the deployment plan. By default, any blocking/referencing assemblies will block an assembly update if the referencing assembly needs to be dropped.
/p: AllowIncompatiblePlatform=(BOOLEAN) Specifies whether to attempt the action despite incompatible SQL Server platforms.
/p: AllowUnsafeRowLevelSecurityDataMovement=(BOOLEAN) Do not block data motion on a table that has Row Level Security if this property is set to true. Default is false.
/p: AzureSharedAccessSignatureToken=(STRING) Azure shared access signature (SAS) token, see SqlPackage for Azure Synapse Analytics.
/p: AzureStorageBlobEndpoint=(STRING) Azure blob storage endpoint, see SqlPackage for Azure Synapse Analytics.
/p: AzureStorageContainer=(STRING) Azure blob storage container, see SqlPackage for Azure Synapse Analytics.
/p: AzureStorageKey=(STRING) Azure storage account key, see SqlPackage for Azure Synapse Analytics.
/p: AzureStorageRootPath=(STRING) Storage root path within the container. Without this property, the path defaults to servername/databasename/timestamp/ . See SqlPackage for Azure Synapse Analytics.
/p: BackupDatabaseBeforeChanges=(BOOLEAN) Backs up the database before deploying any changes.
/p: BlockOnPossibleDataLoss=(BOOLEAN 'True') Specifies that the operation will be terminated during the schema validation step if the resulting schema changes could incur a loss of data, including due to data precision reduction or a data type change that requires a cast operation. The default (True) value causes the operation to terminate regardless of whether the target database contains data. An execution with a False value for BlockOnPossibleDataLoss can still fail during deployment plan execution if data is present on the target that cannot be converted to the new column type.
/p: BlockWhenDriftDetected=(BOOLEAN 'True') Specifies whether to block updating a database whose schema no longer matches its registration or is unregistered.
/p: CommandTimeout=(INT32 '60') Specifies the command timeout in seconds when executing queries against SQL Server.
/p: CommentOutSetVarDeclarations=(BOOLEAN) Specifies whether the declaration of SETVAR variables should be commented out in the generated publish script. You might choose to do this if you plan to specify the values on the command line when you publish by using a tool such as SQLCMD.EXE.
/p: CompareUsingTargetCollation=(BOOLEAN) This setting dictates how the database's collation is handled during deployment; by default, the target database's collation will be updated if it does not match the collation specified by the source. When this option is set, the target database's (or server's) collation should be used.
/p: CreateNewDatabase=(BOOLEAN) Specifies whether the target database should be updated or whether it should be dropped and re-created when you publish to a database.
/p: DatabaseEdition=( 'Default') Defines the edition of an Azure SQL Database.
/p: DatabaseLockTimeout=(INT32 '60') Specifies the database lock timeout in seconds when executing queries against SQLServer. Use -1 to wait indefinitely.
/p: DatabaseMaximumSize=(INT32) Defines the maximum size in GB of an Azure SQL Database.
/p: DatabaseServiceObjective=(STRING) Defines the performance level of an Azure SQL Database, such as "P0" or "S1".
/p: DeployDatabaseInSingleUserMode=(BOOLEAN) If true, the database is set to Single User Mode before deploying.
/p: DisableAndReenableDdlTriggers=(BOOLEAN 'True') Specifies whether Data Definition Language (DDL) triggers are disabled at the beginning of the publish process and re-enabled at the end of the publish action.
/p: DoNotAlterChangeDataCaptureObjects=(BOOLEAN 'True') If true, Change Data Capture objects are not altered.
/p: DoNotAlterReplicatedObjects=(BOOLEAN 'True') Specifies whether objects that are replicated are identified during verification.
/p: DoNotDropObjectType=(STRING) An object type that should not be dropped when DropObjectsNotInSource is true. Valid object type names are Aggregates, ApplicationRoles, Assemblies, AsymmetricKeys, BrokerPriorities, Certificates, ColumnEncryptionKeys, ColumnMasterKeys, Contracts, DatabaseRoles, DatabaseTriggers, Defaults, ExtendedProperties, ExternalDataSources, ExternalFileFormats, ExternalTables, Filegroups, FileTables, FullTextCatalogs, FullTextStoplists, MessageTypes, PartitionFunctions, PartitionSchemes, Permissions, Queues, RemoteServiceBindings, RoleMembership, Rules, ScalarValuedFunctions, SearchPropertyLists, SecurityPolicies, Sequences, Services, Signatures, StoredProcedures, SymmetricKeys, Synonyms, Tables, TableValuedFunctions, UserDefinedDataTypes, UserDefinedTableTypes, ClrUserDefinedTypes, Users, Views, XmlSchemaCollections, Audits, Credentials, CryptographicProviders, DatabaseAuditSpecifications, DatabaseScopedCredentials, Endpoints, ErrorMessages, EventNotifications, EventSessions, LinkedServerLogins, LinkedServers, Logins, Routes, ServerAuditSpecifications, ServerRoleMembership, ServerRoles, ServerTriggers.
/p: DoNotDropObjectTypes=(STRING) A semicolon-delimited list of object types that should not be dropped when DropObjectsNotInSource is true. Valid object type names are Aggregates, ApplicationRoles, Assemblies, AsymmetricKeys, BrokerPriorities, Certificates, ColumnEncryptionKeys, ColumnMasterKeys, Contracts, DatabaseRoles, DatabaseTriggers, Defaults, ExtendedProperties, ExternalDataSources, ExternalFileFormats, ExternalTables, Filegroups, FileTables, FullTextCatalogs, FullTextStoplists, MessageTypes, PartitionFunctions, PartitionSchemes, Permissions, Queues, RemoteServiceBindings, RoleMembership, Rules, ScalarValuedFunctions, SearchPropertyLists, SecurityPolicies, Sequences, Services, Signatures, StoredProcedures, SymmetricKeys, Synonyms, Tables, TableValuedFunctions, UserDefinedDataTypes, UserDefinedTableTypes, ClrUserDefinedTypes, Users, Views, XmlSchemaCollections, Audits, Credentials, CryptographicProviders, DatabaseAuditSpecifications, DatabaseScopedCredentials, Endpoints, ErrorMessages, EventNotifications, EventSessions, LinkedServerLogins, LinkedServers, Logins, Routes, ServerAuditSpecifications, ServerRoleMembership, ServerRoles, ServerTriggers.
/p: DropConstraintsNotInSource=(BOOLEAN 'True') Specifies whether constraints that do not exist in the database snapshot (.dacpac) file will be dropped from the target database when you publish to a database.
/p: DropDmlTriggersNotInSource=(BOOLEAN 'True') Specifies whether DML triggers that do not exist in the database snapshot (.dacpac) file will be dropped from the target database when you publish to a database.
/p: DropExtendedPropertiesNotInSource=(BOOLEAN 'True') Specifies whether extended properties that do not exist in the database snapshot (.dacpac) file will be dropped from the target database when you publish to a database.
/p: DropIndexesNotInSource=(BOOLEAN 'True') Specifies whether indexes that do not exist in the database snapshot (.dacpac) file will be dropped from the target database when you publish to a database.
/p: DropObjectsNotInSource=(BOOLEAN) Specifies whether objects that do not exist in the database snapshot (.dacpac) file will be dropped from the target database when you publish to a database. This value takes precedence over DropExtendedProperties.
/p: DropPermissionsNotInSource=(BOOLEAN) Specifies whether permissions that do not exist in the database snapshot (.dacpac) file will be dropped from the target database when you publish updates to a database.
/p: DropRoleMembersNotInSource=(BOOLEAN) Specifies whether role members that are not defined in the database snapshot (.dacpac) file will be dropped from the target database when you publish updates to a database.
/p: DropStatisticsNotInSource=(BOOLEAN 'True') Specifies whether statistics that do not exist in the database snapshot (.dacpac) file will be dropped from the target database when you publish to a database.
/p: ExcludeObjectType=(STRING) An object type that should be ignored during deployment. Valid object type names are Aggregates, ApplicationRoles, Assemblies, AsymmetricKeys, BrokerPriorities, Certificates, ColumnEncryptionKeys, ColumnMasterKeys, Contracts, DatabaseRoles, DatabaseTriggers, Defaults, ExtendedProperties, ExternalDataSources, ExternalFileFormats, ExternalTables, Filegroups, FileTables, FullTextCatalogs, FullTextStoplists, MessageTypes, PartitionFunctions, PartitionSchemes, Permissions, Queues, RemoteServiceBindings, RoleMembership, Rules, ScalarValuedFunctions, SearchPropertyLists, SecurityPolicies, Sequences, Services, Signatures, StoredProcedures, SymmetricKeys, Synonyms, Tables, TableValuedFunctions, UserDefinedDataTypes, UserDefinedTableTypes, ClrUserDefinedTypes, Users, Views, XmlSchemaCollections, Audits, Credentials, CryptographicProviders, DatabaseAuditSpecifications, DatabaseScopedCredentials, Endpoints, ErrorMessages, EventNotifications, EventSessions, LinkedServerLogins, LinkedServers, Logins, Routes, ServerAuditSpecifications, ServerRoleMembership, ServerRoles, ServerTriggers.
/p: ExcludeObjectTypes=(STRING) A semicolon-delimited list of object types that should be ignored during deployment. Valid object type names are Aggregates, ApplicationRoles, Assemblies, AsymmetricKeys, BrokerPriorities, Certificates, ColumnEncryptionKeys, ColumnMasterKeys, Contracts, DatabaseRoles, DatabaseTriggers, Defaults, ExtendedProperties, ExternalDataSources, ExternalFileFormats, ExternalTables, Filegroups, FileTables, FullTextCatalogs, FullTextStoplists, MessageTypes, PartitionFunctions, PartitionSchemes, Permissions, Queues, RemoteServiceBindings, RoleMembership, Rules, ScalarValuedFunctions, SearchPropertyLists, SecurityPolicies, Sequences, Services, Signatures, StoredProcedures, SymmetricKeys, Synonyms, Tables, TableValuedFunctions, UserDefinedDataTypes, UserDefinedTableTypes, ClrUserDefinedTypes, Users, Views, XmlSchemaCollections, Audits, Credentials, CryptographicProviders, DatabaseAuditSpecifications, DatabaseScopedCredentials, Endpoints, ErrorMessages, EventNotifications, EventSessions, LinkedServerLogins, LinkedServers, Logins, Routes, ServerAuditSpecifications, ServerRoleMembership, ServerRoles, ServerTriggers.
/p: GenerateSmartDefaults=(BOOLEAN) Automatically provides a default value when updating a table that contains data with a column that does not allow null values.
/p: IgnoreAnsiNulls=(BOOLEAN 'True') Specifies whether differences in the ANSI NULLS setting should be ignored or updated when you publish to a database.
/p: IgnoreAuthorizer=(BOOLEAN) Specifies whether differences in the Authorizer should be ignored or updated when you publish to a database.
/p: IgnoreColumnCollation=(BOOLEAN) Specifies whether differences in the column collations should be ignored or updated when you publish to a database.
/p: IgnoreColumnOrder=(BOOLEAN) Specifies whether differences in table column order should be ignored or updated when you publish to a database.
/p: IgnoreComments=(BOOLEAN) Specifies whether differences in the comments should be ignored or updated when you publish to a database.
/p: IgnoreCryptographicProviderFilePath=(BOOLEAN 'True') Specifies whether differences in the file path for the cryptographic provider should be ignored or updated when you publish to a database.
/p: IgnoreDdlTriggerOrder=(BOOLEAN) Specifies whether differences in the order of Data Definition Language (DDL) triggers should be ignored or updated when you publish to a database or server.
/p: IgnoreDdlTriggerState=(BOOLEAN) Specifies whether differences in the enabled or disabled state of Data Definition Language (DDL) triggers should be ignored or updated when you publish to a database.
/p: IgnoreDefaultSchema=(BOOLEAN) Specifies whether differences in the default schema should be ignored or updated when you publish to a database.
/p: IgnoreDmlTriggerOrder=(BOOLEAN) Specifies whether differences in the order of Data Manipulation Language (DML) triggers should be ignored or updated when you publish to a database.
/p: IgnoreDmlTriggerState=(BOOLEAN) Specifies whether differences in the enabled or disabled state of DML triggers should be ignored or updated when you publish to a database.
/p: IgnoreExtendedProperties=(BOOLEAN) Specifies whether differences in the extended properties should be ignored or updated when you publish to a database.
/p: IgnoreFileAndLogFilePath=(BOOLEAN 'True') Specifies whether differences in the paths for files and log files should be ignored or updated when you publish to a database.
/p: IgnoreFilegroupPlacement=(BOOLEAN 'True') Specifies whether differences in the placement of objects in FILEGROUPs should be ignored or updated when you publish to a database.
/p: IgnoreFileSize=(BOOLEAN 'True') Specifies whether differences in the file sizes should be ignored or whether a warning should be issued when you publish to a database.
/p: IgnoreFillFactor=(BOOLEAN 'True') Specifies whether differences in the fill factor for index storage should be ignored or whether a warning should be issued when you publish to a database.
/p: IgnoreFullTextCatalogFilePath=(BOOLEAN 'True') Specifies whether differences in the file path for the full-text catalog should be ignored or whether a warning should be issued when you publish to a database.
/p: IgnoreIdentitySeed=(BOOLEAN) Specifies whether differences in the seed for an identity column should be ignored or updated when you publish updates to a database.
/p: IgnoreIncrement=(BOOLEAN) Specifies whether differences in the increment for an identity column should be ignored or updated when you publish to a database.
/p: IgnoreIndexOptions=(BOOLEAN) Specifies whether differences in the index options should be ignored or updated when you publish to a database.
/p: IgnoreIndexPadding=(BOOLEAN 'True') Specifies whether differences in the index padding should be ignored or updated when you publish to a database.
/p: IgnoreKeywordCasing=(BOOLEAN 'True') Specifies whether differences in the casing of keywords should be ignored or updated when you publish to a database.
/p: IgnoreLockHintsOnIndexes=(BOOLEAN) Specifies whether differences in the lock hints on indexes should be ignored or updated when you publish to a database.
/p: IgnoreLoginSids=(BOOLEAN 'True') Specifies whether differences in the security identification number (SID) should be ignored or updated when you publish to a database.
/p: IgnoreNotForReplication=(BOOLEAN) Specifies whether the not for replication settings should be ignored or updated when you publish to a database.
/p: IgnoreObjectPlacementOnPartitionScheme=(BOOLEAN 'True') Specifies whether an object's placement on a partition scheme should be ignored or updated when you publish to a database.
/p: IgnorePartitionSchemes=(BOOLEAN) Specifies whether differences in partition schemes and functions should be ignored or updated when you publish to a database.
/p: IgnorePermissions=(BOOLEAN) Specifies whether differences in the permissions should be ignored or updated when you publish to a database.
/p: IgnoreQuotedIdentifiers=(BOOLEAN 'True') Specifies whether differences in the quoted identifiers setting should be ignored or updated when you publish to a database.
/p: IgnoreRoleMembership=(BOOLEAN) Specifies whether differences in the role membership of logins should be ignored or updated when you publish to a database.
/p: IgnoreRouteLifetime=(BOOLEAN 'True') Specifies whether differences in the amount of time that SQL Server retains the route in the routing table should be ignored or updated when you publish to a database.
/p: IgnoreSemicolonBetweenStatements=(BOOLEAN 'True') Specifies whether differences in the semi-colons between T-SQL statements will be ignored or updated when you publish to a database.
/p: IgnoreTableOptions=(BOOLEAN) Specifies whether differences in the table options will be ignored or updated when you publish to a database.
/p: IgnoreTablePartitionOptions=(BOOLEAN) Specifies whether differences in the table partition options will be ignored or updated when you publish to a database. This option applies only to Azure Synapse Analytics dedicated SQL pool databases.
/p: IgnoreUserSettingsObjects=(BOOLEAN) Specifies whether differences in the user settings objects will be ignored or updated when you publish to a database.
/p: IgnoreWhitespace=(BOOLEAN 'True') Specifies whether differences in white space will be ignored or updated when you publish to a database.
/p: IgnoreWithNocheckOnCheckConstraints=(BOOLEAN) Specifies whether differences in the value of the WITH NOCHECK clause for check constraints will be ignored or updated when you publish.
/p: IgnoreWithNocheckOnForeignKeys=(BOOLEAN) Specifies whether differences in the value of the WITH NOCHECK clause for foreign keys will be ignored or updated when you publish to a database.
/p: IncludeCompositeObjects=(BOOLEAN) Include all composite elements as part of a single publish operation.
/p: IncludeTransactionalScripts=(BOOLEAN) Specifies whether transactional statements should be used where possible when you publish to a database.
/p: LongRunningCommandTimeout=(INT32) Specifies the long running command timeout in seconds when executing queries against SQL Server. Use 0 to wait indefinitely.
/p: NoAlterStatementsToChangeClrTypes=(BOOLEAN) Specifies that publish should always drop and re-create an assembly if there is a difference instead of issuing an ALTER ASSEMBLY statement.
/p: PopulateFilesOnFileGroups=(BOOLEAN 'True') Specifies whether a new file is also created when a new FileGroup is created in the target database.
/p: RegisterDataTierApplication=(BOOLEAN) Specifies whether the schema is registered with the database server.
/p: RunDeploymentPlanExecutors=(BOOLEAN) Specifies whether DeploymentPlanExecutor contributors should be run when other operations are executed.
/p: ScriptDatabaseCollation=(BOOLEAN) Specifies whether differences in the database collation should be ignored or updated when you publish to a database.
/p: ScriptDatabaseCompatibility=(BOOLEAN) Specifies whether differences in the database compatibility should be ignored or updated when you publish to a database.
/p: ScriptDatabaseOptions=(BOOLEAN 'True') Specifies whether target database properties should be set or updated as part of the publish action.
/p: ScriptDeployStateChecks=(BOOLEAN) Specifies whether statements are generated in the publish script to verify that the database name and server name match the names specified in the database project.
/p: ScriptFileSize=(BOOLEAN) Controls whether size is specified when adding a file to a filegroup.
/p: ScriptNewConstraintValidation=(BOOLEAN 'True') At the end of publish all of the constraints will be verified as one set, avoiding data errors caused by a check or foreign key constraint in the middle of publish. If set to False, your constraints are published without checking the corresponding data.
/p: ScriptRefreshModule=(BOOLEAN 'True') Include refresh statements at the end of the publish script.
/p: Storage=() Specifies how elements are stored when building the database model. For performance reasons the default is InMemory. For large databases, File backed storage is required.
/p: TreatVerificationErrorsAsWarnings=(BOOLEAN) Specifies whether errors encountered during publish verification should be treated as warnings. The check is performed against the generated deployment plan before the plan is executed against your target database. Plan verification detects problems such as the loss of target-only objects (such as indexes) that must be dropped to make a change. Verification will also detect situations where dependencies (such as a table or view) exist because of a reference to a composite project, but do not exist in the target database. You might choose to do this to get a complete list of all issues, instead of having the publish action stop on the first error.
/p: UnmodifiableObjectWarnings=(BOOLEAN 'True') Specifies whether warnings should be generated when differences are found in objects that cannot be modified, for example, if the file size or file paths were different for a file.
/p: VerifyCollationCompatibility=(BOOLEAN 'True') Specifies whether collation compatibility is verified.
/p: VerifyDeployment=(BOOLEAN 'True') Specifies whether checks should be performed before publishing that will stop the publish action if issues are present that might block successful publishing. For example, your publish action might stop if you have foreign keys on the target database that do not exist in the database project, and that causes errors when you publish.
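Putting a few of these together, a publish invocation might look like the following sketch (the file, server, and database names are hypothetical; the flags and properties are the ones documented above):

    sqlpackage.exe /Action:Publish /SourceFile:"MyDb.dacpac" ^
        /TargetServerName:"myserver.example.com" /TargetDatabaseName:"MyDb" ^
        /p:BlockOnPossibleDataLoss=True /p:CommandTimeout=120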


Generates a random company name, comprised of a lorem ipsum word and an appropriate suffix, like Dolor Inc., or Convallis Limited.

This Data Type generates a random SIRET/SIREN French business identification number.

SIRET:

SIREN:


Generates a personal number, used in some countries for social security insurance. At the present time only Swedish ones are supported. The personal numbers are generated according to the format you specify:

PersonalNumberWithoutHyphen

PersonalNumberWithHyphen

Generates organisation numbers, used in some countries for registration of companies, associations etc. At the present time only Swedish ones are supported. The organisation numbers are generated according to the format you specify:

OrganisationNumberWithoutHyphen

OrganisationNumberWithHyphen

Generates random Canadian provinces, states, territories or counties, based on the options you select. The Full Name and Abbreviation sub-options determine whether the output will contain the full string (e.g. "British Columbia") or its abbreviation (e.g. "BC"). For UK counties, the abbreviation is the standard 3-character Chapman code.

This data type generates a random latitude and/or longitude. If both are selected, it displays both separated by a comma.

This data type generates random, valid credit card numbers according to the format you specify. It is currently capable of generating numbers for the following brands: Mastercard, Visa, Visa Electron, American Express, Discover, American Diner's, Carte Blanche, Diner's Club International, JCB, Maestro, Solo, Switch, Laser.

Generates a random credit card PIN number from 1111 to 9999.

Generates a random credit card CVV number from 111 to 999.

This option generates a fixed number of random words, pulled from the standard lorem ipsum latin text.

This option generates a random number of words - the total number within the range that you specify (inclusive). As with the Fixed number option, the words are pulled from the standard lorem ipsum latin text.

This Data Type lets you generate random alpha-numeric strings. The following table contains the character legend for this field. Any other characters you enter into this field will appear unescaped.

Generates a Boolean value in the format you need. You can specify multiple formats by separating them with the pipe (|) character. The following strings will be converted to their Boolean equivalent:

  • Yes or No
  • False or True
  • 0 or 1
  • Y or N
  • F or T
  • false or true

true and false values are special. Depending on the export type, these may be output without double quotes.

Generates a column that contains a unique number on each row, incrementing by whatever value you enter. This option can be helpful for inserting the data into a database field with an auto-increment primary key.

The optional placeholder string lets you embed the generated increment value within a string, via the placeholder. For example:

This randomly generates a number between the values you specify. Both fields allow you to enter negative numbers.

This data type generates random currency values, in whatever format and range you want. The example dropdown contains several options so you can get a sense of how it works, but here's what each of the options means.

Format

Range - From

Range - To

Currency Symbol

Prefix/Suffix

This data type lets you generate a column of data that has repeating values from row to row. Here's a couple of examples to give you an idea of how this works.

  • If you'd like to provide the value "1" for every row, you can enter "1" in the Value(s) field and any value (>0) in the Loop Count field.
  • If you'd like to have 100 rows of the string "Male" followed by 100 rows of the string "Female" and repeat, you can enter "100" in the Loop Count field and "Male|Female" in the Value(s) field.
  • If you'd like 5 rows of 1 through 10, enter "5" for the Loop Count field, and "1|2|3|4|5|6|7|8|9|10" in the Value(s) field.

Try tinkering around with it. You'll get the idea.

The Composite data type lets you combine the data from any other row or rows, and manipulate it, change it, combine the information and more. The content should be entered in the Smarty templating language.

To output the value from any row, just use the placeholders {$ROW1}, {$ROW2}, etc. You cannot refer to the current row - that would either melt the server and/or make the universe implode.

  • Display a value from row 6: {$ROW6}
  • Assuming row 1 and row 2 contain random numbers, the following are examples of some simple math:
    • {$ROW2-$ROW1} - subtraction
    • {$ROW2*$ROW1} - multiplication
    • {$ROW2/$ROW1} - division

    Please see the Smarty website for more information on the syntax.

    This data type lets you generate tree-like data in which every row is a child of another row - except the very first row, which is the trunk of the tree. This data type must be used in conjunction with the Auto-Increment data type: that ensures that every row has a unique numeric value, which this data type uses to reference the parent rows.

    The options let you specify which of your form fields is the appropriate auto-increment field and the maximum number of children a node may have.

    Enter a list of items, separated by a pipe | character. Then select whether you want Exactly X number of items, or At most X items from the list. Multiple items are returned in a comma-delimited list in the results. If you want your data set to include empty values, just add one or more pipe characters at the end - the more pipes you enter, the greater the probability of an empty value being generated.

    The Computed Data Type gives you access to the metadata about fields in the row to let you generate whatever output you want based on that information. If you just need to access the generated string value from another field (i.e. what you see in the output), see the Composite Data Type. This field type gives you much more access to each field.

{$ROW1}, etc. contain everything available about that particular row. The content changes based on the row's Data Type and what has been generated, but high-level it contains the following properties:

    • - whatever options were entered in the interface/API call for the row
    • - any additional metadata returned for the Data Type
    • - the actual generated random content for this field (always in a "display" property) plus any other information about the generated content
    • - a handy JSON-serialization of everything in the row, so you can see what's available. Just run it through a JSON formatter.
    • - will output the gender ("male", "female" or "unknown") of the generated content of a Names Data Type field (be sure to replace "1" with the right row number!). If you used FemaleName as the placeholder string this variable will return "female" every time. If you entered "Name", the value returned will depend on the generated string. If you entered a placeholder string with multiple formats, it will return "unknown" if it contained both genders, or no genders (e.g. a surname without a first name).


    About

Ever needed custom formatted sample / test data, like, bad? Well, that's the idea of this script. It's a free, open source tool written in JavaScript, PHP and MySQL that lets you quickly generate large volumes of custom data in a variety of formats for use in testing software, populating databases, and so on and so forth.

This site offers an online demo where you're welcome to tinker around to get a sense of what the script does, what features it offers and how it works. Then, once you've whetted your appetite, there's a free, fully functional, GNU-licensed version available for download. Alternatively, if you want to avoid the hassle of setting it up on your own server, you can donate $20 or more to get an account on this site, letting you generate up to 5,000 records at a time (instead of the maximum 100) and save your data sets. Click on the Donate tab for more information.

    Extend it

The out-of-the-box script contains the sort of functionality you generally need. But nothing's ever complete - maybe you need to generate random esoteric math equations, pull random tweets or display random images from Flickr with the word "Red-backed vole" in the title. Who knows. Everyone's use-case is different.

    With this in mind, the new version of the script (3.0.0+) was designed to be fully extensible: developers can write their own Data Types to generate new types of random data, and even customize the Export Types - i.e. the format in which the data is output. For people interested in generating more accurate localized geographical data, they can add new Country plugins that supply region names (states, provinces, territories etc), city names and postal/zip code formats for their country of choice. For more information on all this, visit the Developer Documentation.

    Download

Click the button below to download the latest version of the script from GitHub. For more information see the User Documentation.


    User Accounts

This section lets you create any number of user accounts to allow people access to the script. Only you are able to create or delete accounts.

    Donate now!

    If this has helped you in your work, a donation is always appreciated! If a general sense of do-goodery isn't enough to persuade you to donate, here are a few more material incentives:

    • Supporting the project leads to great new features! Honest!
    • Donating $20 or more will get you a user account on this website. With a user account you can:
      • Generate up to 10,000 rows at a time instead of the maximum 100.
      • Save your form configurations so you don't have to re-create your data sets every time you return to the site.

      Every $20 you donate adds a year to your account. You may return at a later date to add more time to your account - it will be added to the end of your current time. Just be sure to donate with the same email address. If you have any trouble donating or with your user account, just drop me a line.

      After donating, you will be emailed with details about how to finish setting up your account (check your spam folder!). If you have any problems, please contact me.


      Input Arguments

Conn — Database connection (connection object)

Database connection, specified as an ODBC connection object or JDBC connection object created using the database function.

Sqlquery — SQL statement (character vector | string scalar)

SQL statement, specified as a character vector or string scalar. The SQL statement can be any valid SQL statement, including nested queries, or a call to a stored procedure. For stored procedures that return one or more result sets, use the fetch function. For procedures that return output arguments, use runstoredprocedure.

      For information about the SQL query language, see the SQL Tutorial.

      Data Types: char | string

Opts — Database import options (SQLImportOptions object)

      Database import options, specified as an SQLImportOptions object.

Pstmt — SQL prepared statement (SQLPreparedStatement object)

      SQL prepared statement, specified as an SQLPreparedStatement object.

      Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

      Example: results = fetch(conn,sqlquery,'MaxRows',50,'DataReturnFormat','structure') imports 50 rows of data as a structure.

'MaxRows' — Maximum number of rows to return (positive numeric scalar)

Maximum number of rows to return, specified as the comma-separated pair consisting of 'MaxRows' and a positive numeric scalar. By default, the fetch function returns all rows from the executed SQL query. Use this name-value pair argument to limit the number of rows imported into MATLAB®.

      Example: 'MaxRows',10

      Data Types: double

'DataReturnFormat' — Data return format ('table' (default) | 'cellarray' | 'numeric' | 'structure')

Data return format, specified as the comma-separated pair consisting of 'DataReturnFormat' and one of these values: 'table' (default), 'cellarray', 'numeric', or 'structure'.

Use the 'DataReturnFormat' name-value pair argument to specify the data type of the result data, results. To specify integer classes for numeric data, use the opts input argument.

You can specify these values using character vectors or string scalars.

      Example: 'DataReturnFormat','cellarray' imports data as a cell array.

'VariableNamingRule' — Variable naming rule ("modify" (default) | "preserve")

      Variable naming rule, specified as the comma-separated pair consisting of 'VariableNamingRule' and one of these values:

      "modify" — Remove non-ASCII characters from variable names when the fetch function imports data.

      "preserve" — Preserve most variable names when the fetch function imports data. For details, see the Limitations section.

      Example: 'VariableNamingRule',"modify"

      Data Types: string
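As a rough end-to-end sketch (the data source name, credentials, and table name here are hypothetical; database, fetch, and close are the Database Toolbox functions described above):

    % Connect, run a query with name-value options, then clean up
    conn = database('MyDataSource','username','password');   % ODBC data source
    sqlquery = 'SELECT * FROM products';
    results = fetch(conn, sqlquery, 'MaxRows', 10, 'DataReturnFormat', 'table');
    close(conn)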


We have updated our user interface. New documentation is coming soon. If you need help, contact User Support.

      PDX model identifier: A unique identifier assigned by the database management system to unambiguously identify a PDX model.

      Primary cancer site: The primary cancer site is the anatomical site of the cancer origin. More than one primary site can be selected for a search.

      Cancer type tags: Tags are used to group models that share clinical characteristics.

      Diagnosis: Cancer diagnoses are standardized using terms from the Disease Ontology (DO). More than one term can be selected for a search.

      PDX Dosing studies: PDXs that have been used in dosing studies can be searched by treatment and/or treatment responses. Treatment responses are based on modified RECIST criteria. Read more on dosing study design and interpretation here.

      Tumor mutation burden (TMB): Tumor mutation burden is a measurement of the number of mutations carried by tumor cells. TMB is potentially a predictive biomarker to identify tumors that are likely to respond to immunotherapy. In the JAX collection of PDXs, a score of 22 is considered high TMB. Read more about how TMB is calculated here.

Gene fusion: Search for PDX models whose engrafted tumor harbors a gene fusion. Only gene fusions associated with drug efficacy or other cancer-related evidence are reported. Read more about the methods here.

      Gene variants: Search for PDX models whose engrafted human tumors harbor specific gene variants. Gene symbols must be official HGNC symbols. Once a gene symbol is specified, the variants/mutations observed in the PDX collection are displayed. More than one variant/mutation per gene can be selected. Genes that can be searched are restricted to those genes on the JAX Cancer Treatment Profile (CTP) gene panel. Read more about the methods and results here.

      Gene expression across PDX models: Displays a graphical summary of expression levels across PDX models for a gene. Only genes on the JAX CTP panel can be searched. Gene symbols must be official HGNC symbols. Read more about gene expression data here.

      Gene amplification/deletion across PDX models: Displays a graphical summary of gene expression across PDX models for a gene with the bars representing expression colored according to amplification/deletion status of the gene. Only genes on the JAX CTP panel can be searched. Gene symbols must be official HGNC symbols. Read more about copy number aberration data here.



      When working with SQLite, opening and inspecting the SQLite database can be helpful while debugging issues. You can leverage the Stetho library to view your data directly, or you can use the following command-line tools to retrieve the data.

      The commands below will show how to get at the data (whether running on an emulator or an actual device). The commands should be performed within the terminal or command-line. Once you have the data, there are desktop SQLite viewers such as DB Browser for SQLite or SQLite Professional to help inspect the SQLite data graphically.

      On an Emulator

      Use SQLite3 to query the data on the emulator:
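A typical invocation looks like the following (the package and database names are placeholders; -e targets the running emulator):

    adb -e shell
    sqlite3 /data/data/<app package name>/databases/<database name>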

      For further inspection, we can download the database file with:
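For example, with the same placeholder names:

    adb -e pull /data/data/<app package name>/databases/<database name>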

      On a Device

      There isn't a SQLite3 executable on the device so our only option is to download the database file with:
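The equivalent pull for a device looks like this (placeholders as before; -d targets a connected USB device):

    adb -d pull /data/data/<app package name>/databases/<database name>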

      Using Device File Explorer

You can go to View -> Tool Windows -> Device File Explorer and look inside /data/data/<app package name>/databases and download the file locally. You can then use one of the previously mentioned SQLite desktop viewers.

