INTERNSHIP DETAILS

Data Engineering Internship

Company: Auditoria.AI
Location: Santa Clara
Work Mode: Remote
Posted: April 17, 2026
Internship Information
Core Responsibilities
You will build and maintain data ingestion pipelines and transformation processes to support AI-native enterprise SaaS platforms. This includes structuring data for LLM readiness, implementing data quality monitoring, and optimizing warehouse performance.
Internship Type
Intern
Company Size
277
Visa Sponsorship
No
Language
English
Working Hours
40 hours
About The Company
We help finance teams work faster, reduce manual effort, and improve cash performance using agentic AI built specifically for finance. Our platform automates critical workflows across accounts payable, accounts receivable, vendor management, and the general ledger. By taking work out of email, spreadsheets, and exceptions, we help teams close faster and operate with greater accuracy.

Our AI Agents connect directly to systems of record and shared accounting inboxes. They understand finance-specific requests and take action, such as responding to inquiries, processing invoices, and advancing collections, without relying on rigid rules or manual intervention.

Behind the scenes, we combine a proprietary finance-specialized language model with leading large language models to deliver secure, enterprise-ready AI. This approach improves accuracy while fitting cleanly into existing ERP environments, so teams get value quickly without heavy IT involvement.

We remove friction from high-volume, exception-driven processes and provide real-time visibility into cash performance. Customers use our platform to reduce workload and attrition, strengthen controls, and make faster, more informed decisions.

We integrate seamlessly with ERPs and shared inboxes to engage suppliers, customers, and internal teams. Our agentic AI automates collections, payment processing, vendor management, procurement controls, and finance inquiries, while continuously generating structured data that supports forecasting and performance insights. Our products allow finance teams to reclaim thousands of hours, so they can refocus on higher-value work.
About the Role

We're scaling an AI-native enterprise SaaS platform that powers agentic automation for corporate finance teams at Fortune 500 companies. As a Data Engineering Intern, you'll build the data infrastructure that makes our agents work: clean, well-modeled, LLM-ready data flowing from customer ERPs into Snowflake, through our semantic layer, and into the retrieval pipelines that ground every decision our agents make.

You'll work across the modern data stack and implement medallion architecture patterns that serve both operational systems and AI/ML workloads.
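To give a feel for the medallion pattern mentioned above, here is a minimal sketch of a Bronze → Silver → Gold flow in plain Python. All field names (invoice_id, vendor, amount) are invented for illustration and are not Auditoria.AI's actual schema or tooling.

```python
from datetime import date

def to_silver(bronze_rows):
    """Clean and type raw (Bronze) records into a Silver layer: strip strings,
    parse dates, cast amounts, and drop rows that fail validation."""
    silver = []
    for row in bronze_rows:
        try:
            silver.append({
                "invoice_id": row["invoice_id"].strip(),
                "vendor": row["vendor"].strip().title(),
                "amount": float(row["amount"]),
                "invoice_date": date.fromisoformat(row["invoice_date"]),
            })
        except (KeyError, ValueError):
            continue  # a real pipeline would quarantine bad rows, not silently skip
    return silver

def to_gold(silver_rows):
    """Aggregate Silver records into a Gold mart: total spend per vendor."""
    totals = {}
    for row in silver_rows:
        totals[row["vendor"]] = totals.get(row["vendor"], 0.0) + row["amount"]
    return totals

bronze = [
    {"invoice_id": " INV-1 ", "vendor": "acme corp", "amount": "120.50",
     "invoice_date": "2026-01-15"},
    {"invoice_id": "INV-2", "vendor": "acme corp", "amount": "79.50",
     "invoice_date": "2026-02-01"},
    {"invoice_id": "INV-3", "vendor": "globex", "amount": "not-a-number",
     "invoice_date": "2026-02-02"},
]
gold = to_gold(to_silver(bronze))
print(gold)  # {'Acme Corp': 200.0} -- the malformed INV-3 row is dropped in Silver
```

In practice each layer would be a warehouse table (e.g. dbt models over Snowflake) rather than in-memory dicts, but the layering idea is the same: raw data lands untouched in Bronze, Silver enforces types and quality, and Gold serves analytics and AI workloads.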

Key Responsibilities

  • Building ingestion pipelines from customer ERPs and finance systems into the data warehouse
  • Writing transformations in our Bronze, Silver, Gold medallion architecture, with an eye toward making data LLM-ready: well-named, well-typed, well-documented, and semantically meaningful
  • Extending the semantic layer that powers natural-language analytics, which is what lets non-technical finance users ask questions and get grounded answers
  • Preparing and structuring data for retrieval, embeddings, vector search, and context assembly for RAG pipelines that feed our agents
  • Implementing data quality checks, lineage, and monitoring so agents never act on bad data
  • Tuning queries and warehouse usage for both cost and latency
  • Contributing to technical documentation and participating in code reviews
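As a rough illustration of the retrieval-preparation bullet above, the sketch below turns a structured finance record into a labeled text chunk with metadata, the shape an embedding model and a filtered vector search would consume. Function and field names are hypothetical, not the actual pipeline.

```python
def record_to_chunk(record):
    """Render a structured record as a compact, semantically labeled passage,
    keeping metadata alongside it for filtered vector search."""
    text = (
        f"Invoice {record['invoice_id']} from vendor {record['vendor']} "
        f"for ${record['amount']:.2f}, due {record['due_date']}, "
        f"status: {record['status']}."
    )
    return {
        "text": text,
        "metadata": {"vendor": record["vendor"], "status": record["status"]},
    }

records = [
    {"invoice_id": "INV-7", "vendor": "Acme Corp", "amount": 450.0,
     "due_date": "2026-03-01", "status": "overdue"},
]
chunks = [record_to_chunk(r) for r in records]
print(chunks[0]["text"])
# Invoice INV-7 from vendor Acme Corp for $450.00, due 2026-03-01, status: overdue.
```

The point of the labeled rendering is that a retrieved chunk carries its own context ("vendor", "status", units), so an agent grounded on it does not have to guess what a bare number means.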


Qualifications

  • Pursuing (or recently graduated) a Bachelor's or Master's in Computer Science, Data Engineering, Statistics, or a related field
  • Solid SQL skills: joins, window functions, and a basic grasp of how to read a query plan
  • Hands-on experience with at least one relational database (MySQL, Postgres, or similar) through coursework, projects, or prior internships
  • Comfortable writing Python for data processing and scripting
  • Genuine interest in LLMs and AI systems: you've played with OpenAI/Anthropic APIs, built a RAG project, or thought seriously about how data shape affects model behavior
  • Excellent communication: you can explain what you built and why
  • Must be currently authorized to work in the United States without employer sponsorship, as we are unable to sponsor or transfer visas for this position
  • Must be located in or within commuting distance of Santa Clara, CA to be considered
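For context on the SQL expectations above (window functions and reading a query plan), here is a small self-contained example using SQLite via Python's standard library. The table and column names are invented for the sketch; the warehouse in the role is Snowflake, but the concepts carry over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (vendor TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [("Acme", 100.0), ("Acme", 50.0), ("Globex", 75.0)],
)

# A window function: running total of spend per vendor.
rows = conn.execute("""
    SELECT vendor, amount,
           SUM(amount) OVER (PARTITION BY vendor ORDER BY amount) AS running
    FROM invoices
    ORDER BY vendor, amount
""").fetchall()
print(rows)
# [('Acme', 50.0, 50.0), ('Acme', 100.0, 150.0), ('Globex', 75.0, 75.0)]

# Reading a query plan: SQLite exposes EXPLAIN QUERY PLAN. With no index on
# vendor, the plan shows a full table scan; adding an index would change it.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM invoices WHERE vendor = 'Acme'"
).fetchall()
print(plan)
```

Being able to notice "this plan is a full scan, and the filter column has no index" is exactly the kind of query-plan literacy the qualification is asking for.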


Preferred Qualifications

  • A graduation date of 2026 or late 2025
  • Exposure to Snowflake, BigQuery, or Databricks
  • Experience with dbt, Airflow, or another orchestration/transformation tool
  • Experience with vector databases (Pinecone, Weaviate, pgvector, Snowflake Cortex Search) or embedding workflows
  • Understanding of dimensional modeling (star/snowflake schemas)
  • Any prior internship or substantive personal project in data engineering
  • Authorized to work in the United States without the need for future sponsorship
Key Skills
SQL, Python, Data Engineering, Snowflake, ETL Pipelines, Medallion Architecture, LLMs, RAG, Data Modeling, Vector Databases, Database Management, Data Quality, API Integration, Communication, Technical Documentation
Categories
Data & Analytics, Software, Technology, Engineering, Finance & Accounting