Amazon Web Services is a cloud computing platform that allows for easy and secure procurement of cloud infrastructure and storage. We have been using AWS since June 2015 to host our internal infrastructure and to store our client data securely.
Oracle is an industry-leading relational and transactional database platform that is proven to be fast and reliable. We've been using Oracle since Sagacity launched and it is fully embedded in the business, with much of our proprietary software built on it.
Python is an open-source, general-purpose programming language that emphasises code readability, which allows for fast development. We have been using Python since 2016 for simple and fast loading of data into our Oracle instances. We also use it to create the proprietary algorithms and packages in our Customer Data Solutions, and to communicate with Apache Spark via the PySpark package.
Apache Spark is an open-source platform that enables us to process large amounts of data in a very short space of time by running jobs in parallel. It is a focus area of our development activities because of its fast processing times. We've amassed a wealth of knowledge and experience through various projects and solutions, including data warehouse solutions using a combination of Apache Spark and Delta Lake, reconciliation assurance projects, and processing terabytes of data for our Value Based Management solutions.
Databricks provides a useful front-end environment and optimisations on top of Apache Spark. We have been actively using it since October 2017 to help us develop and analyse data. We use Databricks' automated jobs extensively in production as a fast way of deploying our applications onto a cluster. Our analysts also use Databricks to investigate data interactively.
Microsoft Azure is a cloud computing platform that we operate as an alternative to AWS where this is requested by our clients.
Apache Hadoop's HDFS is an open-source, fault-tolerant distributed file system that we use when handling large-scale data sets from our clients. It enables us to ensure that data is readily available to meet our clients' needs.
Scala is a programming language that runs on the Java Virtual Machine (JVM) and focuses on concurrency and scalability. It is the language that Apache Spark is written in. We use Scala to interact with Apache Spark for big data projects where performance is critical and where there are particularly complex data processing requirements.
Kubernetes is an open-source platform that we use, when required, to automate application deployment, scaling, and management. We are currently using it to develop platforms and tools that will enhance our performance on future client projects.
SQL Server is a relational database management system, developed by Microsoft. SQL Server supports a wide variety of transaction processing, business intelligence, and analytics applications and is currently one of the market-leading database technologies. Our skilled developers utilise SQL Server to extract and process data to help our clients manage their data effectively.
Pandas is a fast, powerful, and flexible open-source data analysis and manipulation tool, built on the Python programming language. It offers data structures and operations for manipulating numerical tables and time series. We have been actively using the software since 2016 and have built up extensive knowledge of using Pandas' data frames for client use cases, including data exploration and identifying trends in smaller data sets.
Power BI is a Microsoft business analytics platform that contains a collection of software services, apps, and connectors that work together to turn unrelated sources of data into coherent, visually immersive, and interactive insights. We use Power BI to create dashboards that transform our clients' data, making it usable and understandable. This allows them to work with the data, track performance against KPIs, spot trends and inform decision-making.
Power Query is a business intelligence tool and data preparation engine available in Microsoft Excel that allows the import, cleansing, transformation and reshaping of data from many different sources. Our analysts use Power Query to simplify the process of importing data from different source files and manipulating it for analytical purposes.
Qlik is an end-to-end software vendor specialising in data visualisation, executive dashboards, and self-service business intelligence. We have used Qlik for over ten years for the development of client dashboards. Qlik was named a Leader in the 2021 Gartner Magic Quadrant for Analytics and Business Intelligence Platforms.
AWS Lambda is an event-driven, serverless computing platform that is a part of Amazon Web Services. It is a computing service that runs code in response to events and automatically manages the computing resources required by that code. Our skilled analysts use AWS Lambda for the management and installation of EC2 and RDS instances through a Slack interface.
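As a rough illustration of the event-driven model described above, the sketch below shows a Python handler that reacts to a chat command. The `lambda_handler(event, context)` signature is the standard Lambda convention for Python, but everything else is an assumption made for illustration: the shape of the event (a dictionary with a `text` field), the command format, and the helper name `parse_command`. It makes no AWS SDK calls and does not represent our actual implementation.

```python
# Hypothetical command vocabulary; not taken from any real deployment.
ALLOWED_ACTIONS = {"start", "stop"}
ALLOWED_SERVICES = {"ec2", "rds"}


def parse_command(text):
    """Split '<action> <service> <identifier>' into a validated dictionary."""
    parts = text.strip().split()
    if len(parts) != 3:
        raise ValueError("expected '<action> <service> <identifier>'")
    action, service, identifier = parts[0].lower(), parts[1].lower(), parts[2]
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    if service not in ALLOWED_SERVICES:
        raise ValueError(f"unsupported service: {service}")
    return {"action": action, "service": service, "identifier": identifier}


def lambda_handler(event, context):
    """Entry point that Lambda invokes once per incoming event."""
    try:
        command = parse_command(event.get("text", ""))
    except ValueError as exc:
        # Reject malformed or unsupported commands with a client error.
        return {"statusCode": 400, "body": str(exc)}
    # A real function would call the AWS SDK at this point; this sketch
    # only reports what it would have done.
    body = "Would {action} {service} instance {identifier}".format(**command)
    return {"statusCode": 200, "body": body}


if __name__ == "__main__":
    print(lambda_handler({"text": "stop ec2 i-0abc"}, None))
```

Validating the action and service against a fixed allow-list before doing anything else keeps a chat-driven interface from triggering operations it was never meant to expose.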
Structured Query Language, commonly known as SQL, is a standard programming language for relational databases and is one of the most widely implemented database languages. We use SQL to analyse our clients' data to maximise its potential. It is available in various dialects, such as PL/SQL for Oracle, Transact-SQL for SQL Server, and Spark SQL for Apache Spark.
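As a minimal illustration of the kind of analytical query SQL makes possible, the sketch below totals amounts per customer using Python's built-in `sqlite3` module and an in-memory database. The `payments` table, its columns, the function name `total_by_customer` and the sample values are all hypothetical, and SQLite is used only because it needs no setup; the dialects named above differ from it in places.

```python
import sqlite3


def total_by_customer(rows):
    """Return (customer, total) pairs, largest total first."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table used purely for illustration.
        conn.execute("CREATE TABLE payments (customer TEXT, amount REAL)")
        conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT customer, SUM(amount) AS total "
            "FROM payments "
            "GROUP BY customer "
            "ORDER BY total DESC"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [("A", 10.0), ("B", 5.0), ("A", 2.5)]
    print(total_by_customer(sample))  # [('A', 12.5), ('B', 5.0)]
```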
Keras is a deep learning API written in Python for programming neural networks, which was developed with a focus on enabling fast experimentation. We use Keras for our machine learning program pipelines, to enable an independently executable workflow of a complete machine learning task.
TensorFlow is an end-to-end open-source platform for machine learning. We have used TensorFlow on numerous projects, most notably to determine, from the adjustment notes, the root cause of agent credit note adjustments as part of debt improvement activities.
We use MLflow, an open-source platform for managing the machine learning lifecycle, including experimentation, reproducibility, deployment and a central model registry. We mainly use MLflow for machine learning models in Value Based Management products, integrating with Databricks to experiment with, save and run predictive models.
Scikit-learn is a machine learning library for the Python programming language. It provides a range of supervised and unsupervised learning algorithms via a consistent interface in Python. We have been using it for over three years both to enhance our machine learning models and to solve the complex machine learning problems that we encounter.
Terraform is a tool for building, changing, and versioning infrastructure safely and efficiently. It can manage existing and popular service providers as well as custom in-house solutions. We utilise Terraform for the efficient management of our architecture in AWS.