Introduction
As an EthIndia fellow, I had the opportunity to work on an exciting project called ETLA, a platform designed to supercharge web3 application backends and make it easier for decentralized applications (dApps) to access and manage their data. In this blog post, I'll share my experiences working on the ETLA indexer and the Kinesis-based architecture, and how they fit into the overall ETLA ecosystem.
What is ETLA?
ETLA is a tool that aims to simplify data management for web3 applications. It allows developers to create data pipelines and connect them to database instances with custom logic without setting up and maintaining their own backend infrastructure. With ETLA, web3 firms can focus on building and scaling their applications while ETLA handles their data needs.
ETLA Architecture Overview
The ETLA architecture consists of three main components: the ETLA indexer, Amazon S3, and Amazon Kinesis. Here's a high-level overview of how these components work together to provide efficient data management for web3 applications:
ETLA Indexer: The ETLA indexer fetches data from blockchain nodes and other sources and transforms it into a format suitable for further processing and storage. The indexer is designed to handle the vast amounts of data generated by web3 applications in real time.
Amazon S3: Once the ETLA indexer processes the data, it is pushed to an Amazon S3 bucket for storage. Amazon S3 is a scalable and reliable object storage service that allows for easy retrieval and management of stored data.
Amazon Kinesis: Kinesis is a managed service that enables real-time data processing and analysis at scale. ETLA utilizes Kinesis to process and analyze data from the S3 bucket based on user-defined logic.
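Concretely, the handoff between these components boils down to two writes per block: the serialized block goes to S3 under a predictable key, and the block number goes onto the Kinesis stream. Here's a minimal sketch in TypeScript with the AWS calls abstracted behind an interface; the key scheme and function names are my own illustration, not ETLA's actual code:

```typescript
// The two writes the architecture performs per block. The AWS calls are
// abstracted behind this interface so the sketch stays SDK-agnostic; in the
// real system these would be S3 PutObject and Kinesis PutRecord calls.
interface Sinks {
  putObject: (key: string, body: string) => Promise<void>; // S3 write
  putRecord: (data: string) => Promise<void>;              // Kinesis write
}

// Deterministic S3 key, so a consumer holding only the block number
// can locate the stored block data.
function blockKey(blockNumber: number): string {
  return `blocks/${blockNumber}.json`;
}

// Store the indexed block in S3 first, then announce its number on Kinesis,
// so any consumer that sees the number knows the data already exists in S3.
async function publishBlock(
  sinks: Sinks,
  blockNumber: number,
  block: unknown
): Promise<void> {
  await sinks.putObject(blockKey(blockNumber), JSON.stringify(block));
  await sinks.putRecord(String(blockNumber));
}
```

Writing to S3 before Kinesis matters: the stream record is what triggers downstream processing, so the data it points to must already be durable.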
The ETLA architecture provides several benefits for web3 applications:
Scalability: The architecture can handle large volumes of data and scale to accommodate growing data needs.
Real-time processing: Data is processed and analyzed in real-time, enabling web3 applications to make timely decisions based on the latest information.
Custom logic: Users can define their processing logic, giving them complete control over how their data is transformed and analyzed.
Cost efficiency: The ETLA architecture leverages managed services, reducing the need for dedicated backend infrastructure and minimizing operational costs.
How does the ETLA Architecture Work?
Here's a step-by-step breakdown of the data flow in the ETLA architecture:
The ETLA indexer fetches raw data from blockchain nodes or other sources.
The indexer processes the data and stores it in an Amazon S3 bucket.
Kinesis Data Streams ingest block numbers corresponding to the stored data.
The block numbers in the Kinesis Data Streams trigger AWS Lambda functions. Each Lambda function fetches the corresponding block data from the S3 bucket.
The Lambda functions process the block data based on user-defined logic, such as filtering, aggregation, or transformation.
Processed data is sent to the appropriate storage destination or returned to the application for further use.
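The Lambda steps above can be sketched as follows. The event shape follows the standard Kinesis-to-Lambda payload (base64-encoded record data); `fetchBlock` and `userLogic` are hypothetical stand-ins for the S3 GetObject call and the user-defined pipeline, not ETLA's actual code:

```typescript
// Shape of the Kinesis event a Lambda receives, simplified to the fields used here.
interface KinesisEvent {
  Records: { kinesis: { data: string } }[]; // data is base64-encoded
}

// Pure helper: extract the block numbers carried by a Kinesis event.
function blockNumbersFrom(event: KinesisEvent): number[] {
  return event.Records.map((r) =>
    Number(Buffer.from(r.kinesis.data, "base64").toString("utf8"))
  );
}

// Sketch of the Lambda body: fetch each block from S3 and run the
// user-defined logic on it. The two callbacks are injected so the
// sketch stays self-contained.
async function handler(
  event: KinesisEvent,
  fetchBlock: (n: number) => Promise<unknown>,  // e.g. S3 GetObject on blocks/<n>.json
  userLogic: (block: unknown) => unknown        // filter / aggregate / transform
): Promise<unknown[]> {
  const results: unknown[] = [];
  for (const n of blockNumbersFrom(event)) {
    const block = await fetchBlock(n);
    results.push(userLogic(block));
  }
  return results;
}
```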
The ETLA Indexer
One of the critical components of ETLA is the indexer. The indexer is responsible for fetching data from various sources, such as blockchain nodes or external APIs, and transforming it into a format easily consumed by the application.
As a fellow, I worked on building the ETLA indexer from the ground up. This involved designing a system that could ingest the vast amount of data generated by web3 applications and process it in real time. The indexer had to be highly scalable and efficient, handling various data types and formats.
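At its core, fetching a block from a node is a single JSON-RPC call. Here's a minimal sketch assuming a standard Ethereum JSON-RPC endpoint; the URL is a placeholder, not ETLA's actual data source:

```typescript
// Placeholder endpoint; ETLA's actual data source is internal.
const RPC_URL = "http://localhost:8545";

// Build an eth_getBlockByNumber request body for a given block height.
function getBlockRequest(blockNumber: number, id = 1) {
  return {
    jsonrpc: "2.0",
    id,
    method: "eth_getBlockByNumber",
    // Params: hex-encoded block number, and `true` to include full
    // transaction objects rather than just transaction hashes.
    params: ["0x" + blockNumber.toString(16), true],
  };
}

// Fetch one block from the node (uses the global fetch of Node 18+).
async function fetchBlockFromNode(blockNumber: number): Promise<unknown> {
  const res = await fetch(RPC_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(getBlockRequest(blockNumber)),
  });
  const { result } = (await res.json()) as { result: unknown };
  return result;
}
```

A real indexer layers batching, retries, and reorg handling on top of this call, which is where most of the engineering effort goes.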
In the first week, I looked into the code of the existing indexers, but none of them was fast enough to meet our latency needs.
The next few weeks went into degen-mode research: hunting for a good data source and working on ETLA's architecture. Along the way, I found some exciting projects like Ethereum ETL and TrueBlocks.
Then I moved into the development phase and started building ETLA's in-house indexer. It is currently written in TypeScript (though it could be migrated to other languages).
Building an indexer was no rocket science, but integrating it into ETLA's existing data stack was a different task. The data had to stay consistent, with no blocks lost. At the start, we were experiencing some data loss due to TypeScript's asynchronous execution.
To tackle this, we implemented a promise queue with our own in-house algorithms. With that solved, we moved on to the next phase of development: building the backend for ETLA.
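Our actual implementation is internal, but the idea can be sketched as a concurrency-limited promise queue that hands results back in submission order, so a slow block can never be overtaken by a later one:

```typescript
// A small promise queue: runs tasks with bounded concurrency and returns
// results in submission order. The interface is a sketch, not ETLA's
// internal API.
class PromiseQueue<T> {
  private active = 0;
  private waiting: (() => void)[] = [];

  constructor(private concurrency: number) {}

  // Run one task, waiting for a free slot if the queue is saturated.
  async run(task: () => Promise<T>): Promise<T> {
    while (this.active >= this.concurrency) {
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    }
    this.active++;
    try {
      return await task();
    } finally {
      this.active--;
      this.waiting.shift()?.(); // wake up the next waiter, if any
    }
  }

  // Submit many tasks; Promise.all preserves submission order in the
  // results, regardless of which task settles first.
  runAll(tasks: (() => Promise<T>)[]): Promise<T[]> {
    return Promise.all(tasks.map((t) => this.run(t)));
  }
}
```

Because `Promise.all` keys each result to its submission index, out-of-order completion never reorders the output, which is exactly the consistency property the indexer needed.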
PS: The current ETLA indexer is on par with etherscan (or maybe faster 😉)
Learnings and challenges
Through the fellowship, I learned a lot about the web3 space and explored many valuable products and tools. It also allowed me to work on a real-world project and put my knowledge into practice.
One of my biggest challenges was working with AWS (especially EC2). I also had to weigh different options for sourcing blockchain data and choose the most efficient one for the ETLA system. Another challenge was building the UI, as I had limited experience in front-end development.
Conclusion
The ETLA architecture offers an efficient and scalable solution for data management in web3 applications. By leveraging Amazon S3 and Kinesis, ETLA provides real-time data processing and analysis capabilities, enabling developers to focus on building innovative web3 applications without worrying about backend infrastructure. As the web3 ecosystem grows, ETLA's architecture is well-positioned to meet the increasing data demands of decentralized applications.
The EthIndia Fellowship '22 was an excellent opportunity to learn and grow in web3. I built a working MVP for ETLA in just eight weeks, an achievement I'm proud of. Throughout the fellowship, I faced several challenges and obstacles, but I was able to overcome them through hard work, perseverance, and the guidance of my mentor.
I learned a lot about AWS tools and architecture, data warehousing, UI design, and connecting the front and back ends of a web3 product. Overall, the EthIndia Fellowship '22 was a great learning experience, and I am excited to keep building and contributing to the web3 ecosystem.
Kudos to the Devfolio team, and huge thanks to my mentor, Thrilok!