Python for Consolidating Financial Data from Multiple Sybase Servers

Python can efficiently fetch and aggregate financial data from multiple Sybase database servers using a structured approach involving database connectors and data processing libraries. This enables automated comparison of cusip pricing across various vendor sources.

The Problem

Data engineers frequently need to consolidate specific financial instrument data, such as cusip pricing, from disparate vendor database systems. The manual process of querying each of the 3-4 Sybase servers individually using tools like Aqua Data Studio and then comparing results is time-consuming and prone to human error. An automated Python script is required to connect to all specified Sybase instances, execute a targeted query for a given cusip, and present the consolidated results.

The Solution

The following Python script uses pyodbc to connect to multiple Sybase databases and pandas to aggregate the query results. Ensure that the appropriate Sybase ODBC driver is installed on the system where the script will run.

import pyodbc
import pandas as pd
import sys

# --- Configuration ---
# Define connection details for each Sybase server.
# Replace placeholders with actual connection strings, usernames, and passwords.
DATABASE_CONFIGS = [
    {
        'name': 'VendorA_Sybase',
        'connection_string': 'DRIVER={Sybase ASE ODBC Driver};SERVER=sybase_server_a;PORT=5000;DATABASE=db_a;',
        'user': 'your_user_a',
        'password': 'your_password_a'
    },
    {
        'name': 'VendorB_Sybase',
        'connection_string': 'DRIVER={Sybase ASE ODBC Driver};SERVER=sybase_server_b;PORT=5001;DATABASE=db_b;',
        'user': 'your_user_b',
        'password': 'your_password_b'
    },
    {
        'name': 'VendorC_Sybase',
        'connection_string': 'DRIVER={Sybase ASE ODBC Driver};SERVER=sybase_server_c;PORT=5002;DATABASE=db_c;',
        'user': 'your_user_c',
        'password': 'your_password_c'
    }
]

def fetch_cusip_data_from_db(db_config: dict, cusip_code: str) -> pd.DataFrame | None:
    """
    Connects to a single Sybase database and fetches pricing data for a given CUSIP.

    Args:
        db_config: Dictionary containing database connection details.
        cusip_code: The CUSIP code to query.

    Returns:
        A pandas DataFrame with cusip pricing data or None if an error occurs.
    """
    conn = None
    try:
        # Establish connection using pyodbc
        conn = pyodbc.connect(
            db_config['connection_string'],
            uid=db_config['user'],
            pwd=db_config['password']
        )
        cursor = conn.cursor()

        # Parameterized query to prevent SQL injection
        query = """
            SELECT
                cusip_column,
                price_column,
                last_updated_date
            FROM
                your_pricing_table
            WHERE
                cusip_column = ?
            ORDER BY
                last_updated_date DESC
            """

        # Execute the query with the cusip_code parameter
        cursor.execute(query, cusip_code)

        # Fetch all results
        rows = cursor.fetchall()

        # Get column names from cursor description
        columns = [column[0] for column in cursor.description]

        # Create a DataFrame
        df = pd.DataFrame(rows, columns=columns)
        df['SourceDB'] = db_config['name'] # Add a column to identify the source database

        return df

    except pyodbc.Error as ex:
        sqlstate = ex.args[0]
        print(f"Error connecting or querying {db_config['name']} (SQLSTATE: {sqlstate}): {ex}", file=sys.stderr)
        return None
    except Exception as e:
        print(f"An unexpected error occurred for {db_config['name']}: {e}", file=sys.stderr)
        return None
    finally:
        if conn:
            conn.close()

def consolidate_cusip_pricing(cusip: str) -> pd.DataFrame:
    """
    Fetches and consolidates cusip pricing data from all configured Sybase databases.

    Args:
        cusip: The CUSIP code to search for across all databases.

    Returns:
        A pandas DataFrame containing aggregated pricing data from all sources.
    """
    all_results = []
    for db_config in DATABASE_CONFIGS:
        print(f"Attempting to fetch data for CUSIP '{cusip}' from {db_config['name']}...")
        df = fetch_cusip_data_from_db(db_config, cusip)
        if df is not None and not df.empty:
            all_results.append(df)
        elif df is not None and df.empty:
            print(f"No data found for CUSIP '{cusip}' in {db_config['name']}.")
        else:
            print(f"Failed to fetch data from {db_config['name']}.")

    if not all_results:
        print(f"No data found for CUSIP '{cusip}' across any configured databases.")
        return pd.DataFrame() # Return an empty DataFrame

    # Concatenate all DataFrames into a single result
    consolidated_df = pd.concat(all_results, ignore_index=True)
    return consolidated_df

# --- Example Usage ---
if __name__ == "__main__":
    target_cusip = "123456ABC" # Replace with your actual CUSIP

    # Fetch and consolidate data
    final_data = consolidate_cusip_pricing(target_cusip)

    if not final_data.empty:
        print(f"\n--- Consolidated Pricing Data for CUSIP: {target_cusip} ---")
        print(final_data.to_string()) # Use to_string() for full DataFrame output
        # Further processing or comparison logic can be added here
    else:
        print(f"\nNo consolidated data available for CUSIP: {target_cusip}.")

Why It Works

Modular Configuration: Database connection details are stored in a list of dictionaries, allowing easy addition or modification of server configurations without altering the core logic. This promotes maintainability and scalability.
pyodbc for Database Connectivity: The pyodbc library provides a standard way to connect to various ODBC-compliant databases, including Sybase, from Python. It acts as a wrapper around the native Sybase ODBC driver, enabling efficient communication.
Parameterized Queries: Using ? as a placeholder for the cusip_code in the SQL query and passing the value as a separate argument to cursor.execute() prevents SQL injection vulnerabilities, a critical security practice.
Error Handling: try...except pyodbc.Error blocks are implemented to catch specific database errors (e.g., connection issues, query failures) and try...except Exception for broader unexpected issues, providing informative feedback without crashing the script.
pandas for Data Aggregation: pandas.DataFrame is used to structure the data fetched from each database. The pd.concat() function efficiently combines these individual DataFrames into a single, comprehensive DataFrame, simplifying further analysis and display.
Source Identification: An explicit SourceDB column is added to each DataFrame before concatenation. This is crucial for distinguishing which vendor or server contributed each row of data, enabling direct comparison of cusip values across sources.

Python for Consolidating Financial Data from Multiple Sybase Servers

The Problem

The Solution

Why It Works

Reference

Leave a Comment Cancel Reply