Amazon Web Services (AWS) is a cloud computing platform that allows for easy and secure procurement of cloud infrastructure and storage. We have been using AWS since June 2015 to host our internal infrastructure and to host our client data safely.
Oracle is an industry-leading relational and transactional database platform that is proven to be fast and reliable. We have been using Oracle since Sagacity launched, and it is fully embedded in the business, with much of our proprietary software built on it.
Python is an open-source, general-purpose programming language that emphasises code readability, which allows for fast development. We have been using Python since 2016 for simple and fast loading of data into our Oracle instances. We also use it to create the proprietary algorithms and packages in our Customer Data Solutions, and to communicate with Apache Spark via the pyspark package.
Apache Spark is an open-source platform that enables us to process large amounts of data in a very short space of time by running jobs in parallel. It is a focus area of our development activities because of its fast processing times. We have amassed a wealth of knowledge and experience through various projects and solutions, including data warehouse solutions using a combination of Apache Spark and Delta Lake, reconciliation assurance projects, and processing terabytes of data for our Value Based Management solutions.
Databricks provides a useful front-end environment and optimisations on top of Apache Spark. We have been actively using it since October 2017 to help us develop and analyse data. We use Databricks' automated jobs extensively for production workloads, as they are a fast way of deploying our applications onto a cluster. Our analysts also use Databricks to investigate data interactively.
Microsoft Azure is a cloud computing platform that we operate as an alternative to AWS where our clients request it.
The Hadoop Distributed File System (HDFS), part of Apache Hadoop, is an open-source, fault-tolerant distributed file system that we use when handling large-scale data sets from our clients. It enables us to ensure that data is readily available to meet our clients' needs.
Scala is a programming language that runs on the Java Virtual Machine (JVM) and focuses on concurrency and scalability. It is the language in which Apache Spark is written. We use Scala to interact with Apache Spark for big data projects where performance is critical and where there are particularly complex data processing requirements.
Kubernetes is an open-source platform that we use, when required, to automate application deployment, scaling, and management. We are currently using it to develop platforms and tools that will enhance our performance on future client projects.