
To apply, see the instructions below or call us at 305-651-6500.

Applied AI Engineer

REMOTE, Florida

Job Title: Applied AI Engineer
Location: Remote (Florida-based company)
Job Type: Contract or Permanent Full-Time
Salary: $120K to $140K + Solid Benefits
Job #: 7365

If you believe coding with AI is a bad idea, please do not apply to this position… otherwise:

Help us build the future of automation with AI at its core. If you care about shipping real products, solving hard problems with large language models, and building platform capabilities that help others move faster, this is the role for you.

We’re hiring across multiple teams, each with its own focus area in applied AI. Depending on the team, you may work on shared SDKs, evaluation and benchmarking systems, orchestration frameworks, retrieval infrastructure, guardrails, or user-facing AI features. What these teams share is a common operating model: you will ship to production, own meaningful problems end-to-end, and deliver impact across the organization.

Even if this description feels tailored to a very specific candidate, the reality is that the role often adapts to the strengths and experience of the person who joins. If you meet the core criteria below, we encourage you to apply.

About You

Core engineering background

  • 5+ years in software engineering, including 3+ years building distributed, cloud-native services (e.g., microservices, event-driven systems, asynchronous workers, API gateways).

  • Hands-on experience with service reliability and performance: profiling, latency budgets, throughput tuning, backpressure, rate limiting, caching, and resilience patterns (timeouts, retries, idempotency, circuit breakers).

  • Comfortable operating production systems with observability: structured logging, metrics, tracing (OpenTelemetry), dashboards, and on-call style debugging.
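To give a concrete sense of the resilience patterns listed above, here is a minimal Python sketch of retries with exponential backoff bounded by an overall latency budget (the function and parameter names are illustrative, not ours):

```python
import time

def call_with_retries(fn, *, max_attempts=3, base_delay=0.1, deadline_s=2.0,
                      retryable=(TimeoutError, ConnectionError)):
    """Retry a flaky call with exponential backoff, bounded by a total deadline."""
    start = time.monotonic()
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts:
                raise
            # Back off exponentially, but never blow past the latency budget.
            delay = base_delay * (2 ** (attempt - 1))
            if time.monotonic() - start + delay > deadline_s:
                raise
            time.sleep(delay)
```

In real services this would sit alongside idempotency keys and circuit breakers so that retried work is safe to repeat.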

Applied LLM experience (production-grade)

  • 1+ year deploying LLM-powered features in production, including prompt design, tool/function calling, and multi-step workflows that must be reliable under real user traffic.

  • Experience with agentic architectures: planners/executors, tool routers, memory management, and guardrailed action execution (e.g., constrained tool schemas, sandboxing, deterministic fallbacks).

  • Familiarity with model behavior and failure modes: hallucinations, instruction hierarchy conflicts, tool misuse, prompt injection, and data leakage risks—and practical mitigations.
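By "constrained tool schemas with deterministic fallbacks" we mean something like the following sketch: every model-proposed tool call is checked against an allowlist and a per-tool argument schema before execution (the `lookup_order` tool and registry shape are hypothetical):

```python
# Hypothetical tool registry: name -> (allowed args, required args, handler).
TOOLS = {
    "lookup_order": (
        {"order_id"},
        {"order_id"},
        lambda args: f"order {args['order_id']}: shipped",
    ),
}

def execute_guarded(tool_name, args, fallback="Sorry, I can't do that."):
    """Reject tool calls not in the registry or violating its schema."""
    spec = TOOLS.get(tool_name)
    if spec is None:
        return fallback  # Unknown tool: deterministic fallback, never execute.
    allowed, required, handler = spec
    # Required args must be present; no args outside the allowed set.
    if not required <= set(args) <= allowed:
        return fallback
    return handler(args)
```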

Model fundamentals and evaluation

  • Working understanding of transformer-based models (attention, tokenization, context windows) and how these constraints influence product and system design.

  • Experience building LLM evaluation pipelines:

    • Offline and online evals (golden datasets, regression tests, shadow deployments, A/B tests).

    • Metrics such as task success rate, exact match / rubric scoring, faithfulness, latency, cost per successful task, and safety outcomes.

    • Methods like judge-model scoring, human review workflows, and adversarial test sets.

  • Exposure to prompt/version management and reproducible experimentation (dataset versioning, prompt diffs, model snapshot tracking).
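As a small illustration of the eval metrics above, an exact-match scorer over a golden dataset looks roughly like this (the dataset and predictor are toy examples):

```python
def exact_match_rate(golden, predict):
    """Score a prediction function against (input, expected) golden pairs.

    Whitespace is normalized; anything stricter or looser (rubric scoring,
    judge models) plugs in at the comparison step.
    """
    hits = sum(1 for x, expected in golden
               if predict(x).strip() == expected.strip())
    return hits / len(golden)
```

Regression suites then pin a threshold on this rate so prompt or model changes cannot silently degrade task success.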

Retrieval and knowledge systems (RAG)

  • Experience designing and operating Retrieval-Augmented Generation systems:

    • Document ingestion, normalization, chunking strategies (semantic, recursive, structural), metadata enrichment, and deduplication.

    • Vector search and hybrid search (BM25 + dense retrieval), query rewriting, re-ranking, and caching strategies.

    • Managing vector stores and indexes, and tuning for latency, recall, and precision under load.

  • Understanding of how retrieval choices affect grounding, citation quality, and freshness, and how to detect retrieval regressions.
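One common way to combine BM25 and dense retrieval, mentioned above as hybrid search, is reciprocal rank fusion; a minimal sketch (doc ids are placeholders):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids (e.g., one from BM25, one from dense
    retrieval) into a single ranking.

    Each list contributes 1 / (k + rank) per document; k=60 is the value
    commonly used in the literature.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked well by both retrievers rises to the top even if neither retriever put it first.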

Cloud and platform engineering

  • Experience deploying services on cloud infrastructure (AWS/GCP/Azure), including containers (Docker), orchestration (Kubernetes/ECS), and CI/CD automation.

  • Comfort with storage and messaging primitives: relational DBs, Redis, queues/streams, object storage, and background job frameworks.

  • Ability to document trade-offs clearly (quality vs. latency vs. cost; determinism vs. flexibility; synchronous vs. async execution).

Customer-first delivery

  • You love shipping. You translate ambiguous user needs into measurable outcomes and production-ready implementations, balancing velocity with correctness, safety, and operational rigor.

  • You collaborate well in distributed teams and value clear written communication, pragmatic design docs, and healthy engineering practices.

Things You’ll Do

Build and scale LLM-powered systems

  • Design and ship LLM-backed product features that require high reliability—leveraging tool calling, structured outputs (JSON schemas), and workflow orchestration.

  • Implement prompting and orchestration patterns such as:

    • Multi-step reasoning with intermediate artifacts.

    • Tool routing and fallback models.

    • Deterministic post-processing and validation layers.

  • Integrate safeguards like input/output filtering, schema validation, policy enforcement, and secure handling of sensitive data.
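The "structured outputs plus validation layers" pattern above amounts to never trusting raw model JSON; a minimal sketch of the deterministic validation step (the schema shape is illustrative):

```python
import json

def parse_structured_output(raw, schema):
    """Validate a model's JSON reply against a simple {field: type} schema.

    Returns (ok, value). Anything malformed is rejected here rather than
    passed downstream; callers can then retry, fall back, or escalate.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False, None
    # Exact field set: no missing fields, no unexpected extras.
    if not isinstance(data, dict) or set(data) != set(schema):
        return False, None
    for field, typ in schema.items():
        if not isinstance(data[field], typ):
            return False, None
    return True, data
```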

Engineer feedback loops and continuous improvement

  • Instrument AI features to capture high-signal telemetry (user outcomes, tool success/failure, latency per step, token usage, cost per request).

  • Build automated pipelines that convert production signals into:

    • New eval datasets and regression suites.

    • Prompt/model updates that can be tested via shadow traffic and progressive rollout.

  • Establish quality gates in CI/CD (eval thresholds, safety checks, performance budgets).
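A CI/CD quality gate of the kind described above can be as simple as comparing eval metrics to minimum thresholds and failing the build on any miss (metric names are examples only):

```python
def quality_gate(results, thresholds):
    """Return the list of failed checks; an empty list means the gate passes.

    Both arguments map metric name -> value. Higher is better for every
    metric in this sketch; invert latency or cost metrics before passing.
    """
    return [name for name, minimum in thresholds.items()
            if results.get(name, float("-inf")) < minimum]
```

In a pipeline, a non-empty result would block the prompt or model change from promoting past shadow traffic.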

Build shared infrastructure and developer tooling

  • Develop internal libraries/SDKs for consistent AI development (prompt templates, tool schema generators, standardized tracing, error taxonomies).

  • Build an evaluation platform that supports:

    • Dataset management, reruns, comparison views, and experiment tracking.

    • Model/provider abstraction layers (swappable models, consistent interfaces).

  • Implement orchestration systems that support concurrency control, retries, idempotency, and cost-aware execution.

Retrieval and knowledge platform work

  • Build ingestion and indexing pipelines, optimize chunking/index strategies, implement hybrid search and reranking, and ensure grounding quality.

  • Add monitoring for retrieval quality (coverage, recall proxies, “no-good-retrieval” detectors) and regression alerts.

Improve cost observability and efficiency

  • Implement cost controls: budget alerts, per-feature cost attribution, caching, batching, model tiering, and dynamic routing based on task complexity.

  • Optimize for cost-per-successful-task, not just token reduction—balancing quality, latency, and spend.
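Why cost-per-successful-task rather than raw token reduction? A cheap model that fails often can cost more per unit of delivered value, as this sketch shows (the numbers are made up):

```python
def cost_per_successful_task(attempts):
    """attempts: list of (cost_usd, succeeded) pairs.

    Total spend divided by successful tasks; infinite if nothing succeeded.
    """
    total = sum(cost for cost, _ in attempts)
    successes = sum(1 for _, ok in attempts if ok)
    return total / successes if successes else float("inf")
```

Dynamic routing then sends easy tasks to the cheap tier and hard ones to the expensive tier, minimizing this metric rather than per-request spend.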

Operate production systems

  • Own service health in production: dashboards, alerts, incident response, root cause analysis, and post-incident improvements.

  • Proactively address failures such as model degradation, provider latency spikes, tool execution errors, and data pipeline drift.

Technical Environment (Representative)

  • Languages: Python, TypeScript

  • Runtime: containerized services (Docker), orchestration (Kubernetes/ECS equivalent)

  • APIs: REST/GraphQL, internal service RPC patterns

  • Data: Postgres/MySQL, Redis, object storage, queues/streams

  • Observability: OpenTelemetry tracing, metrics/logging, alerting

  • AI Stack: LLM providers, tool/function calling, RAG with vector + hybrid search, evaluation pipelines, experiment tracking

Nice to Have

  • Experience with security and privacy controls for AI systems (PII handling, secret management, sandboxing, policy enforcement).

  • Experience building workflow engines or orchestration layers (DAG execution, distributed task queues).

  • Experience with model routing, fine-tuning adapters, or embedding model optimization.

  • Prior work on reliability engineering for probabilistic systems (SLOs for non-deterministic components).

Not for you? SherlockTalent offers a $2,500 referral bonus for successful placements into this role. Include your name in the “Referral Source” field on the application.


To apply, email your resume to resumes@sherlocktalent.com