Search

Search IconIcon to open search

AI Whitepapers

Last updatedUpdated: by Simon Späti · CreatedCreated:

Similar to Data Engineering Whitepapers is here the papers related to AI and AI Agents.


# Large Language Models

LLMs are the general intelligence made out of large neural network.


# Anthropic

Anthropic, the company behind Claude.

  • Constitutional AI: Harmlessness from AI Feedback
  • Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples: This research demonstrates that poisoning attacks on LLMs require a near-constant number of malicious documents regardless of dataset size—as few as 250 poisoned samples can successfully backdoor models from 600M to 13B parameters, even though larger models train on 20× more clean data. The findings reveal that attack difficulty does not scale with model size, making data poisoning increasingly practical for large models since the adversary’s requirements remain fixed while training datasets grow proportionally larger.
    • Supporting blog post
    • Small portion of content can affect the LLMs, unlike we thought. Best example is also a small Reddit threads which most models are trained on. But even more, you can add sudo

# Retrieval-Augmented Generation (RAG)

RAG is technique to provide more context to generative large language model.


# AI Tests

These are tests where models get tested.

# Context

Best way to store context.

# Vibe Coding

How Vibe Coding is affecting AI.

  • Vibe Coding Kills Open Source: Generative AI is changing how software is produced and used. In vibe coding, an AI agent builds software by selecting and assembling open-source software (OSS), often without users directly reading documentation, reporting bugs, or otherwise engaging with maintainers. We study the equilibrium effects of vibe coding on the OSS ecosystem. ^de60c1

# Futher Lists


Origin: Data Engineering Whitepapers