Query Acceleration for Big Data


Algebraix Query Accelerator for Apache Spark

The Algebraix Query Accelerator (AQA) is a software component for Spark SQL that lets you automatically provision computations of Spark SQL’s directed acyclic graph. AQA leverages patented inter-query reuse technology to improve performance and reduce cloud infrastructure costs.

By applying AQA to the Spark framework, developers and data scientists can use less expensive resources and fewer nodes while shortening processing times, which lowers total cost of ownership.

Whereas most SQL optimization techniques focus on establishing adjacent data stores, AQA optimizes the actual query execution plans produced by Spark’s Catalyst optimizer. Our software uses Data Algebra to identify and cache computations that are equivalent across queries, and then removes that work from subsequent Spark SQL jobs while still producing the correct final results.
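The general idea of reusing equivalent computations across queries can be illustrated with a small toy sketch. This is not AQA’s implementation and it uses neither Spark nor Data Algebra: the tuple-based plan format, the canonical() rewrite, the ReuseExecutor class, and the sales data are all hypothetical names invented for this illustration. The sketch assumes that two plans are equivalent when they reduce to the same normal form, and it caches the result of every sub-plan so that later queries can skip work that has already been done.

```python
# Toy illustration of inter-query reuse. Everything here is hypothetical:
# it is not AQA, not Spark, and not Data Algebra.

TABLES = {
    "sales": [
        {"region": "east", "year": 2016, "amount": 10, "units": 1},
        {"region": "east", "year": 2016, "amount": 7, "units": 3},
        {"region": "east", "year": 2017, "amount": 20, "units": 2},
        {"region": "west", "year": 2016, "amount": 5, "units": 1},
    ]
}


def scan(table):
    return ("scan", table)


def where(child, column, value):
    return ("filter", child, frozenset({(column, value)}))


def total(child, column):
    return ("sum", child, column)


def canonical(plan):
    """Rewrite a plan into a normal form so equivalent plans compare equal."""
    kind = plan[0]
    if kind == "scan":
        return plan
    if kind == "filter":
        child = canonical(plan[1])
        if child[0] == "filter":
            # Merge stacked filters; the order of predicates no longer matters.
            return ("filter", child[1], child[2] | plan[2])
        return ("filter", child, plan[2])
    if kind == "sum":
        return ("sum", canonical(plan[1]), plan[2])
    raise ValueError(f"unknown plan node: {kind}")


class ReuseExecutor:
    """Evaluates plans and caches the result of every canonical sub-plan."""

    def __init__(self):
        self.cache = {}
        self.computed = 0  # number of plan nodes actually evaluated

    def run(self, plan):
        return self._eval(canonical(plan))

    def _eval(self, node):
        if node in self.cache:
            return self.cache[node]  # work removed: reuse an earlier result
        self.computed += 1
        kind = node[0]
        if kind == "scan":
            result = TABLES[node[1]]
        elif kind == "filter":
            rows = self._eval(node[1])
            result = [r for r in rows if all(r[c] == v for c, v in node[2])]
        else:  # "sum"
            result = sum(r[node[2]] for r in self._eval(node[1]))
        self.cache[node] = result
        return result


ex = ReuseExecutor()
east_2016 = where(where(scan("sales"), "region", "east"), "year", 2016)
same_rows = where(where(scan("sales"), "year", 2016), "region", "east")

print(ex.run(total(east_2016, "amount")), ex.computed)  # 17 3  (scan, filter, sum)
print(ex.run(total(same_rows, "amount")), ex.computed)  # 17 3  (whole query reused)
print(ex.run(total(same_rows, "units")), ex.computed)   # 4 4   (only the new sum runs)
```

The second query is written differently but reduces to the same normal form, so it is answered entirely from the cache. The third query repeats only part of the earlier work, so just its new aggregation is evaluated.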

AQA is a simple-to-install software package that works in conjunction with Amazon Web Services, Amazon Elastic MapReduce (EMR), and the Amazon S3 storage service. Using our product requires no changes to your current Spark scripts or queries.

The initial version of AQA runs alongside Apache Spark to improve SQL performance and user concurrency in that environment; however, AQA is being developed for other databases and big data cloud environments, including Microsoft Azure and IBM Bluemix.



The Benefits of AQA

Improved SQL Query Performance 

AQA’s inter-query optimization approach speeds up SQL queries by 10x to as much as 1000x. Even more importantly, SQL query performance improves exponentially over time. As more users submit more SQL queries, the amount of query repetition, in whole or in part, increases. Thus, AQA effectively gets smarter over time and delivers ever-increasing SQL performance.

Improved Multi-User Concurrency

As the number of users submitting SQL queries increases, performance traditionally drops because users compete for computational resources. Concurrency in big data environments is a compounding issue and will become a much larger one as the big data landscape evolves. AQA’s approach helps to solve this problem by shortening SQL query times. In fact, each user’s queries actually benefit from the queries of other users.

Reduced Operational Costs

Rather than caching data to deliver the required SQL response times, AQA caches computations. As a result, the software effectively substitutes storage costs for compute costs, driving down net operational costs.

Our Core Technology: Data Algebra®

Our core technology, Data Algebra®, is a mathematical approach to representing and manipulating data. Whereas other technologies leverage metadata or adjacent data stores to process data, Data Algebra translates queries into simple algebraic lookups.

Our book, The Algebra of Data: A Foundation for the Data Economy, is an introduction to this genuinely game-changing technology concept. The book was co-written by Gary J. Sherman, PhD, the inventor of the Algebra of Data™ and founding mathematician of Algebraix Data, and Robin Bloor, PhD, also a mathematician as well as an influential researcher, analyst, and well-known author.

InformationWeek and several other publications have noted, “Data algebra is a new approach for managing, integrating, and searching data faster and more efficiently.” Download our e-book to learn more.


Our Technology Patents

The Algebraix Technology Platform is based on our fundamental innovation in the field of applied mathematics: the algebra of data. The company is building a portfolio of patents around its technology platform. We currently hold nine U.S. patents and expect to receive dozens more.

U.S. Patents Granted to Date
  • 7613734 Systems and Methods for Providing Data Sets using a Store of Algebraic Relations
  • 7720806 Systems and Methods for Data Manipulation using Multiple Storage Formats
  • 7769754 Systems and Methods for Data Storage and Retrieval using Algebraic Optimization
  • 7797319 Systems and Methods for Data Model Mapping
  • 7865503 Systems and Methods for Data Storage and Retrieval using Virtual Data Sets
  • 7877370 Systems and Methods for Data Storage and Retrieval using Algebraic Relations
  • 8032509 Systems and Methods for Data Storage and Retrieval using Algebraic Relations Composed from Query Language Statements
  • 8380695 Systems and Methods for Data Storage and Retrieval using Algebraix Relations to Optimize Calculations
  • 8583687 Systems and Methods for Indirect Algebraic Partitioning