About FTGI

FTGI (Fast finding Three-way Gene Interactions) is a fast method for finding three-way gene interactions, i.e., two genes that interact in expression
under the genotypes of another gene, given a dataset in which expression values and genotypes are measured together for each individual.
The most suitable method for this problem is a likelihood ratio test using logistic regression, which we call the interaction test,
but this test is computationally intractable at a genome-wide scale. FTGI prunes a large part of the input combinations,
so that the interaction test does not have to be applied to them.

The main procedure of FTGI prunes the following two cases: 1) the expression patterns in the two-dimensional plane of two genes,
classified by the genotypes of a SNP, are separable, and 2) the expression values are randomly distributed with respect to the classes.
Cases 1) and 2) are pruned by linear discriminant analysis (LDA) and a randomness test, respectively.
The randomness test combines multivariate analysis of variance (MANOVA) and Box's M test,
and is also called the mean-covariance (MC) test.


Tutorial

Binary files of FTGI are available.
A simple workflow for running the binary file is:
1. Download the binary file for your platform.
2. Prepare the input files for genotypes, expression values and alleles.
3. Put the input files in the directory that contains the binary file.
4. Execute the binary file from a command line on your platform (in the directory from step 3).
5. Check the result in the output file.
The following is a step-by-step tutorial for running the binary file.

Download

The first step is to obtain a working copy of the FTGI binary file and of the sample data files.

Platform                    File
Linux (x86_64)              ftgi_linux.tar.gz
Apple Mac (i386, PPC)       ftgi_mac.dmg
Microsoft Windows (Win32)   ftgi_win.zip

Each of the compressed files includes a binary file ("ftgi"), sample files (geno.txt, exp.txt, allele.txt, out_sample.txt) and a readme file. The result for the sample input ("geno.txt", "exp.txt", "allele.txt") is stored in "out_sample.txt". The sample input files contain synthetic data: two SNPs and expression values of seven genes for 30 individuals.

Prepare data files

Each input file contains numerical or character data written as a matrix delimited by white spaces or tabs:

data type    size (#rows × #columns)   delimiters
genotype     n × (2 × #SNPs)           white space / tab
expression   #genes × n                white space / tab
allele       #SNPs × 2                 white space / tab

where n is the number of individuals. Either white space or tab can be used as the delimiter. However, consecutive spaces and consecutive tabs are NOT ALLOWED. The sample input files have the following sizes:

data type    size      delimiters
genotype     30 × 4    white space
expression   7 × 30    tab
allele       2 × 2     white space

Each line of the allele file lists the major and minor alleles of one SNP. If the possible alleles of a SNP are "A" and "G", the corresponding line in the allele file is "A G" or "G A". The binary file accepts a large number of SNPs, up to a limit of one million.
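
The format rules above can be checked before running FTGI. The following Python sketch is not part of FTGI; the function names check_matrix_text and check_inputs are hypothetical. It assumes that every line must contain exactly the expected number of fields separated by a single space or tab, and it also flags a space directly followed by a tab (or the reverse), which is a stricter reading than the rule stated above.

```python
import re

def check_matrix_text(text, n_rows, n_cols):
    """Return a list of problems found in matrix text delimited by single spaces or tabs."""
    problems = []
    lines = text.splitlines()
    if len(lines) != n_rows:
        problems.append(f"expected {n_rows} rows, found {len(lines)}")
    for i, line in enumerate(lines, start=1):
        # Two or more delimiters in a row are reported (assumption: mixed runs count too).
        if re.search(r"[ \t]{2,}", line):
            problems.append(f"line {i}: consecutive delimiters")
        # Split on every single delimiter so that empty fields show up in the count.
        fields = re.split(r"[ \t]", line)
        if len(fields) != n_cols:
            problems.append(f"line {i}: expected {n_cols} columns, found {len(fields)}")
    return problems

def check_inputs(geno_text, exp_text, allele_text, n, n_snps, n_genes):
    """Check the three inputs against the sizes in the table above."""
    return {
        "genotype": check_matrix_text(geno_text, n, 2 * n_snps),
        "expression": check_matrix_text(exp_text, n_genes, n),
        "allele": check_matrix_text(allele_text, n_snps, 2),
    }

if __name__ == "__main__":
    # An allele file for two SNPs: one line per SNP, two alleles per line.
    print(check_matrix_text("A G\nG A\n", n_rows=2, n_cols=2))
```

For the sample data, the expected sizes correspond to n = 30, n_snps = 2 and n_genes = 7.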

How to run

After placing the FTGI binary file and your data files for genotypes, expression values and alleles in a working directory, execute the binary file as follows. Note that the binary file and the data files must be in the same directory.

Linux and Mac
Type the following command in a terminal window,
>> ./ftgi -g geno.txt -e exp.txt -a allele.txt -o out.txt
and enter a threshold value (for example 0.001) after the message,
>> threshold of MC test's pvalue?:

Windows
Type the following at a DOS prompt,
>> ftgi -g geno.txt -e exp.txt -a allele.txt -o out.txt
and enter a threshold for the p-value of the MC test.

The binary file first reads the threshold value and the data files, and sets the output file to "out.txt". FTGI then runs, pruning input combinations by using LDA and the MC test. The result is written to "out.txt".

Options
-g,-e,-a,-o specify the input and output files.
'-g' specifies the data file for genotypes ("geno.txt").
'-e' specifies the data file for expression values of genes ("exp.txt").
'-a' specifies the data file for alleles ("allele.txt").
'-o' specifies the output file ("out.txt").
The options can be given in any order, but each option must be followed by its own file. For example, "./ftgi -e exp.txt -g geno.txt -o out.txt -a allele.txt" is acceptable, but "./ftgi -a exp.txt -o geno.txt -g out.txt -e allele.txt" is NOT ACCEPTABLE.
The following command shows help on the usage.
>> ./ftgi -h
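
For repeated runs, the command above can be driven from a script. The following Python sketch is not part of FTGI; build_command and run_ftgi are hypothetical helper names. It assumes that ftgi reads the threshold from standard input after printing its prompt, which the tutorial suggests but does not state, and that the script is started in the directory that contains the binary file and the data files.

```python
import subprocess

def build_command(geno, expr, allele, out, binary="./ftgi"):
    """Build the argument list; each option is paired with its own file."""
    return [binary, "-g", geno, "-e", expr, "-a", allele, "-o", out]

def run_ftgi(geno, expr, allele, out, threshold, binary="./ftgi"):
    """Run ftgi once and answer the threshold prompt.

    Assumption: the threshold is read from standard input.
    On Windows, pass binary="ftgi" as in the command shown above.
    """
    command = build_command(geno, expr, allele, out, binary=binary)
    # check=True raises CalledProcessError if ftgi exits with a nonzero status.
    return subprocess.run(command, input=f"{threshold}\n", text=True, check=True)

if __name__ == "__main__":
    # Only print the command here; call run_ftgi(...) where the binary is available.
    print(" ".join(build_command("geno.txt", "exp.txt", "allele.txt", "out.txt")))
```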

Result

The result is written to the output file.

example
--------------------------------------------------------------------
Results of FTGI: threshold = 0.001
SNP   Gene1   Gene2   flag   p-value(MC)   likelihood(LDA)   p-value(IT)
0     0       1       0      -0.246028     1.0               1.0
0     0       2       0      -0.824060     1.0               1.0
...
1     5       6       1      -4.747279     -32.584872        -6.618216
--------------------------------------------------------------------
1st column: SNP number (0, 1, ..., #SNPs-1)
2nd column: Gene1 number (0, 1, ..., #Genes-2)
3rd column: Gene2 number (1, 2, ..., #Genes-1)
4th column: flag for the prunings* (0, 1, 2)
5th column: log10(p-value) of MC test
6th column: log-likelihood value by LDA
7th column: log10(p-value) of the IT (Interaction Test)
* : flag = 0 (pruned by MC test), = 1 (not pruned), = 2 (pruned by LDA).
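
The output file can be post-processed with a short script. The following Python sketch is not part of FTGI; parse_ftgi_output and unpruned are hypothetical names. It assumes that each data row consists of seven whitespace-separated fields as in the example above, and it skips every line that does not match that shape (title, column header, separators).

```python
def parse_ftgi_output(text):
    """Parse FTGI result text into a list of dicts, one per data row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue  # title line, separator, or "..."
        try:
            snp, gene1, gene2, flag = (int(f) for f in fields[:4])
            log10_p_mc, loglik_lda, log10_p_it = (float(f) for f in fields[4:])
        except ValueError:
            continue  # column header line
        rows.append({
            "snp": snp, "gene1": gene1, "gene2": gene2, "flag": flag,
            "log10_p_mc": log10_p_mc,   # 5th column
            "loglik_lda": loglik_lda,   # 6th column
            "log10_p_it": log10_p_it,   # 7th column
        })
    return rows

def unpruned(rows):
    """Keep only the combinations that were not pruned (flag = 1)."""
    return [r for r in rows if r["flag"] == 1]

if __name__ == "__main__":
    with open("out.txt") as fh:
        for r in unpruned(parse_ftgi_output(fh.read())):
            # Convert log10(p-value) of the interaction test back to a p-value.
            print(r["snp"], r["gene1"], r["gene2"], 10 ** r["log10_p_it"])
```

Only rows with flag = 1 carry an interaction test result; in the example above, the rows pruned by the MC test show 1.0 in the 6th and 7th columns.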
