[100-Day AI bootcamp] Day 8: Financial Data Assistant

Posted by xkuang on February 7, 2025

https://github.com/xkuang/financial-data-assistant

Development Tools:

  • Code Development: Cursor - The AI-first code editor
  • AI Assistance: Claude (Anthropic) - For rapid development and problem-solving

This project is a financial data platform that aggregates and analyzes data from the FDIC (Federal Deposit Insurance Corporation) and the NCUA (National Credit Union Administration). It includes an automated data pipeline built on Apache Airflow and a web interface with SQL querying capabilities.

Features

 

  • Automated data collection from FDIC and NCUA sources
  • Daily data freshness checks and updates
  • Interactive SQL query interface
  • Pre-built analytics queries
  • AI-powered chat interface for data exploration
  • Comprehensive documentation and API reference

System Requirements

 

  • Python 3.11+
  • PostgreSQL 13+
  • Redis (for Airflow)
  • Virtual environment management tool (venv recommended)

Directory Structure

 

.
├── airflow/
│   ├── dags/
│   │   ├── refresh_financial_data_dag.py
│   │   ├── fdic_ingestion.py
│   │   └── ncua_ingestion.py
├── docs/
│   ├── architecture.md
│   ├── data_mechanics.md
│   └── technical_overview.md
├── prospects/
│   ├── models.py
│   ├── views.py
│   └── urls.py
├── templates/
│   ├── base.html
│   ├── chat/
│   └── documentation/
└── manage.py

Installation

 

  1. Clone the repository:
git clone https://github.com/xkuang/financial-data-assistant.git
cd financial-data-assistant
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install required packages:
pip install -r requirements.txt
  4. Set up environment variables (create a .env file in the project root):
# Django settings
DEBUG=True
SECRET_KEY=your_secret_key_here
ALLOWED_HOSTS=localhost,127.0.0.1

# Database settings
DATABASE_URL=postgresql://user:password@localhost:5432/financial_db

# Airflow settings
AIRFLOW_HOME=/path/to/your/airflow
AIRFLOW__CORE__SQL_ALCHEMY_CONN=sqlite:////path/to/your/airflow/airflow.db
AIRFLOW__CORE__LOAD_EXAMPLES=False
  5. Initialize the database:
python manage.py migrate
python manage.py createsuperuser
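
To illustrate how the values in the .env file might reach Django, here is a minimal sketch that turns DATABASE_URL and ALLOWED_HOSTS into settings-style values using only the standard library. The helper names (database_settings_from_url, allowed_hosts_from_env) are hypothetical and are not taken from the repository, which may well use a package such as dj-database-url or django-environ instead.

```python
from urllib.parse import unquote, urlparse


def database_settings_from_url(url):
    """Convert a postgresql:// URL into a Django-style DATABASES entry.

    Hypothetical helper for illustration; not part of the project.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unsupported database scheme: {parsed.scheme!r}")
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parsed.path.lstrip("/"),
        # Credentials may be percent-encoded inside a URL, so decode them.
        "USER": unquote(parsed.username or ""),
        "PASSWORD": unquote(parsed.password or ""),
        "HOST": parsed.hostname or "localhost",
        "PORT": str(parsed.port or 5432),
    }


def allowed_hosts_from_env(value):
    """Split a comma-separated ALLOWED_HOSTS value into a clean list."""
    return [host.strip() for host in value.split(",") if host.strip()]
```

In a settings module this could be used as DATABASES = {"default": database_settings_from_url(os.environ["DATABASE_URL"])}, assuming the .env file has already been loaded into the process environment.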

Running the Application

 

1. Start Airflow Services

 

Initialize Airflow database (first time only):

airflow db init

Create Airflow user (first time only):

airflow users create \
    --username admin \
    --firstname Admin \
    --lastname User \
    --role Admin \
    --email admin@example.com \
    --password admin

Start Airflow services:

# Start the web server (in a separate terminal)
airflow webserver -p 8081

# Start the scheduler (in another separate terminal)
airflow scheduler

Airflow UI will be available at: http://localhost:8081

2. Start Django Development Server

 

python manage.py runserver 8000

The web application will be available at: http://localhost:8000

Data Pipeline

 

The Airflow DAG (refresh_financial_data) runs daily at 6 AM and performs the following tasks:

  1. Checks FDIC website for new data
  2. Checks NCUA website for new data
  3. Generates a freshness report
  4. Updates database if new data is available
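
The decision logic behind steps 3 and 4 can be sketched in plain Python. This is an assumption about how a freshness check might work, not the code in refresh_financial_data_dag.py: SourceStatus, freshness_report, and sources_to_update are hypothetical names, and the dates would come from whatever the FDIC and NCUA checks return. The cron string is simply how "daily at 6 AM" is usually written for an Airflow schedule.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Cron expression for "every day at 06:00" (minute 0, hour 6).
DAILY_AT_6AM = "0 6 * * *"


@dataclass(frozen=True)
class SourceStatus:
    """Freshness snapshot for one data source (hypothetical structure)."""

    source: str
    latest_published: date         # newest data date found on the source website
    latest_loaded: Optional[date]  # newest data date already in the database

    @property
    def is_stale(self):
        # Stale if nothing was ever loaded or the source has newer data.
        return self.latest_loaded is None or self.latest_published > self.latest_loaded


def freshness_report(statuses):
    """Map each source name to a human-readable freshness label."""
    return {
        s.source: "update needed" if s.is_stale else "up to date"
        for s in statuses
    }


def sources_to_update(statuses):
    """Return the names of sources whose data should be re-ingested."""
    return [s.source for s in statuses if s.is_stale]
```

In a DAG, the two website checks would each produce a SourceStatus, the report task would call freshness_report, and the update task would run only for the names returned by sources_to_update.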

Development

Running Tests

python manage.py test

API Documentation

The application provides several API endpoints for data access:

  • /api/institutions/: List of financial institutions
  • /api/stats/: Quarterly statistics
  • /chat/query/: Natural language query endpoint
  • /chat/sql/: SQL query endpoint
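
As a rough sketch of calling these endpoints from a script, the snippet below builds the URLs and a POST request for the SQL endpoint with the standard library. The post does not document request or response formats, so the JSON body shape ({"query": ...}) and the helper names are assumptions; check views.py in the repository for the real contract.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request

# Address of the Django development server started earlier in this post.
BASE_URL = "http://localhost:8000"


def endpoint_url(path, base_url=BASE_URL):
    """Join the server address with one of the documented endpoint paths."""
    return urljoin(base_url, path)


def build_sql_request(sql, base_url=BASE_URL):
    """Prepare (but do not send) a POST request for /chat/sql/.

    The {"query": ...} payload is an assumed shape, not a documented one.
    """
    body = json.dumps({"query": sql}).encode("utf-8")
    return Request(
        endpoint_url("/chat/sql/", base_url),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The prepared request can be sent with urllib.request.urlopen(request) once the server is running. Depending on how the views are configured, Django may also require a CSRF token or an authenticated session.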

Acknowledgments

  • FDIC for providing financial institution data
  • NCUA for providing credit union data
  • Apache Airflow team for the amazing workflow management platform
  • Django team for the robust web framework

Learning Resources

Essential Reading

  • Designing Data-Intensive Applications by Martin Kleppmann - The definitive guide to building reliable, scalable, and maintainable systems. A must-read for understanding the principles behind modern data systems.
