GATACA is developing novel algorithms and software solutions for virologists. Our work is funded by several Small Business Innovation Research (SBIR) awards from the NSF and NIH (NIAID), and our current focus is hepatitis B virus (HBV) bioinformatics. We have developed an algorithm for assembling deep-coverage NGS data using a novel iterative contig merge-refine-cluster approach that retains all suspected true variants, resolves haplotypes, and characterizes quasispecies. Our pipeline for the hepatitis C virus (HCV) incorporates resistance profiling and mutation tracking/variant calling algorithms that work directly from NGS reads.

Our Pipelines

Harness the full viral mutation spectra from NGS data

Assembly Tool for HBV genomes

Resistance Profiling of HCV genomes

  • Pipeline comprises algorithms for mutation tracking, drug resistance profiling (including resistance to direct-acting antivirals, DAAs), heterogeneity quantification, and variant calling
  • Genotyping directly from NGS reads:
    • Performs rapidly and accurately on datasets ranging from as few as 1,000 reads to 1,000,000 reads or more
    • Quantifies the fraction of each subtype present in the NGS dataset
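
As a rough illustration of the subtype quantification step, the sketch below turns per-read subtype assignments into dataset-level fractions. It assumes each read has already been given a subtype label; the function name `subtype_fractions` and the example labels are hypothetical and do not reflect the pipeline's actual interface.

```python
from collections import Counter

def subtype_fractions(read_subtypes):
    """Return the fraction of reads assigned to each subtype.

    read_subtypes: iterable of subtype labels, one per genotyped read
    (illustrative input, not the output format of any real tool).
    """
    counts = Counter(read_subtypes)
    total = sum(counts.values())
    if total == 0:
        return {}  # no genotyped reads, nothing to quantify
    return {subtype: n / total for subtype, n in counts.items()}

# Hypothetical per-read calls for a mixed-subtype sample
calls = ["1a", "1a", "1b", "3a", "1a", "1b", "1a", "1a"]
fractions = subtype_fractions(calls)
for subtype, fraction in sorted(fractions.items()):
    print(f"{subtype}: {fraction:.3f}")
```

In practice, reads that cannot be assigned confidently would likely be tracked as a separate category rather than dropped, so that the reported fractions are not inflated.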

Data Quality and Variant Analysis

  • QC and error correction
  • Extraction of contamination and redundancy
  • Genotype and SNP Calling
  • Drug Resistance Screening
  • Mutation Signature ID
  • Annotation
  • Resistance Profiling
  • File and sequence platform concordance
  • Other NGS statistics

Sample Outputs

  • NGS file quality

  • Read quality distribution

  • Sample concordance

  • Mutation concordance

  • Variant analysis

  • Mutation signatures

  • Mutation reports

Our Features

Take control of your virology research and gain more insight faster

  • Data Metrics
  • Mutation capture
  • Genome-specific trait ID
  • Dynamic databases
  • Integration of workflows

Data Metrics

Baseline quality controls cover removal of unwanted reads (e.g., reads that are too short or contain low-quality bases), error correction, and extraction of redundant and contaminating reads (e.g., microbial, human). The resulting dataset includes only the viral data, thus streamlining downstream analytics.
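
A minimal sketch of this kind of baseline filtering is shown here, assuming FASTQ input with Phred+33 quality encoding. The thresholds (`min_length=50`, `min_mean_quality=20`) and the function names are illustrative assumptions, not the settings or code used in our pipelines; contamination screening is omitted because it requires alignment against reference genomes.

```python
import io

def parse_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual

def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual)

def filter_reads(handle, min_length=50, min_mean_quality=20):
    """Yield reads that are long enough, of adequate quality, and not exact duplicates."""
    seen = set()
    for name, seq, qual in parse_fastq(handle):
        if len(seq) < min_length:
            continue  # too short
        if mean_phred(qual) < min_mean_quality:
            continue  # low-quality bases
        if seq in seen:
            continue  # redundant (exact duplicate sequence)
        seen.add(seq)
        yield name, seq, qual

# Hypothetical records: one good read, one short, one low quality, one duplicate
records = [
    ("read_ok", "ACGT" * 15, "I" * 60),
    ("read_short", "ACGTACGTAC", "I" * 10),
    ("read_lowq", "TTGCA" * 12, "#" * 60),
    ("read_dup", "ACGT" * 15, "I" * 60),
]
example = "".join(f"@{n}\n{s}\n+\n{q}\n" for n, s, q in records)
kept = [name for name, _, _ in filter_reads(io.StringIO(example))]
print(kept)
```

Exact-match deduplication is the simplest possible redundancy rule; it is used here only to keep the example short.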

Mutation Capture

Resolving quasispecies composition is a daunting challenge in viral informatics. Our proprietary assembly algorithm uses an iterative scheme to produce contigs and interconnections that are highly representative of intra-host variation, including long-range and low-frequency variation. Alignments enable comparison of the output with our dynamic databases of up-to-date resistance mutations, including those conferring resistance to direct-acting antivirals (DAAs).
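
The comparison against a resistance database can be pictured with the simplified sketch below. It is not our proprietary assembly algorithm: it assumes the amino acids observed at each site have already been extracted from aligned reads, and the lookup table, the function names, and the 1% frequency cutoff are all hypothetical. The two table entries are well-known HCV resistance-associated substitutions used purely as examples, not a curated database.

```python
from collections import Counter

# Illustrative lookup table: (gene, position, amino acid) -> substitution name
RESISTANCE_TABLE = {
    ("NS3", 80, "K"): "Q80K",
    ("NS5A", 93, "H"): "Y93H",
}

def residue_frequencies(residues_by_site, min_frequency=0.01):
    """Return (gene, position, amino_acid, frequency) for residues at or above the cutoff.

    residues_by_site maps (gene, position) to the list of amino acids
    observed at that site across the aligned reads.
    """
    results = []
    for (gene, pos), residues in residues_by_site.items():
        counts = Counter(residues)
        depth = sum(counts.values())
        for aa, n in counts.items():
            freq = n / depth
            if freq >= min_frequency:
                results.append((gene, pos, aa, freq))
    return results

def flag_resistance(observed, table=RESISTANCE_TABLE):
    """Map each listed substitution found in the sample to its frequency."""
    return {table[(g, p, a)]: f for g, p, a, f in observed if (g, p, a) in table}

# Hypothetical sample: a 5% minority variant at NS3 position 80, none at NS5A 93
sites = {
    ("NS3", 80): ["Q"] * 95 + ["K"] * 5,
    ("NS5A", 93): ["Y"] * 100,
}
flagged = flag_resistance(residue_frequencies(sites))
print(flagged)
```

A fixed frequency cutoff is a deliberate simplification; separating true low-frequency variants from sequencing error generally requires an error model and sufficient read depth.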

Trait Identification

Key protein-coding regions are critical to understanding the specifics of viral structure and mutation potential. Our methods automatically segregate and identify these regions with unprecedented speed.
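
For readers unfamiliar with how coding regions can be located computationally, a textbook open reading frame (ORF) scan is sketched below. It is a generic technique shown for orientation only, not our method: it treats the sequence as linear, scans only the forward strand, and uses a hypothetical `find_orfs` function with an arbitrary minimum length.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(sequence, min_codons=30):
    """Return (start, end, frame) for forward-strand ORFs.

    An ORF runs from the first ATG in a frame to the next in-frame stop codon.
    start is 0-based, end is exclusive and includes the stop codon;
    min_codons counts every codon from ATG through the stop.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a new ORF at the first start codon
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next ATG
    return orfs

# Toy sequence: two flanking bases, ATG, four GCT codons, a TAA stop, two more bases
toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "GG"
orfs = find_orfs(toy, min_codons=5)
print(orfs)
```

Compact viral genomes complicate this picture (HBV, for example, is circular with overlapping reading frames), so a production approach would need to handle wrap-around coordinates and overlapping genes.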

Dynamic Databases

Implementing a dynamic design has enabled us to overcome the challenges associated with static database design: our databases are scalable and flexible, require minimal data reconstruction, and are updated regularly with emerging variant information (resistance, replacement, escape, etc.).

Integrate Workflow and Experiments

The key to a solid data management system is integrating the workflow: attaching annotations to sequences, creating data linkages, and extracting sequence-specific features all simplify and unify your data input through automation. Integrating the results of multiple experiments, with proper normalization and cross-compiling, makes it possible to discern batch effects and meaningful offsets and to unmask hidden patterns and trends.


The Forum for Collaborative Research

HHS Viral Hepatitis Action Plan

Center for the Study of Hepatitis C

Center for Innovative Technology

National Center for Biotechnology Information

The European Bioinformatics Institute (EMBL-EBI)

Swiss Institute of Bioinformatics

Rhône-Alpes Bioinformatics Center

Additional Offerings

Custom Designed Analysis Tools

Strategies and algorithms custom designed for small complex genomes.

  • Unique data management and analytic methods that resolve high genetic heterogeneity and allow longitudinal analyses to track mutations and viral evolution
  • Custom Analytic Reports
  • Custom design of algorithms for other small complex genomes
  • Customize existing and new algorithms for existing and emerging technical outputs, including:
    • Third-generation sequencing
    • Gene expression (RNA-seq)
    • Protein expression (CyTOF)

Editing / Writing Services

Our team of skilled writers and field-matched bioinformatics experts will advise on the strategy, structure, and creation of your scientific documents and presentation materials. We can help you maximize the impact of your work by creating clear, concise, and powerful publication- and presentation-ready material. Services are offered at varying levels to match individual needs, and a work plan will be created accordingly.

  • Manuscripts
  • Grant Preparation
  • Creation of SOPs
  • Research Reports
  • PowerPoints
  • Pamphlets and Brochures
  • Custom Educational & Promotional Materials
  • Services of value to both native and non-native English speakers

Study Design Consultation

Our team has extensive experience working with, educating, and advocating alongside priority populations identified in the Viral Hepatitis Action Plan. We are available and interested in partnering with research teams looking to advance the Action Plan by decreasing health disparities through education, awareness, and improved access to care.

  • Health Disparities Research Study Design
  • Grant Writing
  • Public Education and Advocacy
  • Community Awareness