Generating blog articles with RAG


Tags
design
LLM pipeline
technical requirements
Published
2023
Author

Overview

Researching accurate facts and figures for blog articles can take up a good chunk of the writing process. Our team developed a step-by-step workflow for AI-powered blog article creation, integrating RAG (retrieval-augmented generation) to enhance content quality and relevance.
I worked on:
v1
  • Designing an AI-assisted, step-by-step process and LLM pipeline for blog article creation
  • Prompt engineering to improve how the AI understands and uses different writing styles
v2
  • Mapping out retrieval of additional data from reliable sources, along with data cleaning
  • Integrating retrieved information into the article generation pipeline for more robust content
v3

LLM pipeline

 
  1. User inputs and content outlining
      • Guide the user to provide a topic or keywords
      • Generate structured outline options with an LLM so the user can mix and match topic sentences
  2. Search retrieval integration
      • Query external data and trusted sources
      • Incorporate key points from retrieved data as talking points
      • Show citations and links to source material
  3. Article generation
      • Apply tone of voice and article length
      • Combine user-provided data, generated data, and retrieved information to create a full-length article
      • Naturally integrate target keywords
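
As a rough illustration of how the three stages fit together, here is a minimal Python sketch. It is not our production code: the `llm` and `search` callables, the `Source` and `Section` classes, and the prompt wording are all hypothetical stand-ins for whichever model and search provider is used.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Source:
    title: str
    url: str
    snippet: str  # key point extracted from the retrieved page


@dataclass
class Section:
    heading: str
    talking_points: list[str] = field(default_factory=list)
    sources: list[Source] = field(default_factory=list)


def build_outline(topic: str, llm: Callable[[str], str]) -> list[Section]:
    """Stage 1: turn the user's topic into sections, one per non-empty line."""
    raw = llm(f"Write a blog outline, one topic sentence per line, about: {topic}")
    return [Section(heading=line.strip()) for line in raw.splitlines() if line.strip()]


def add_research(sections: list[Section], search: Callable[[str], list[Source]]) -> None:
    """Stage 2: attach retrieved key points and their sources to each section."""
    for section in sections:
        for source in search(section.heading):
            section.talking_points.append(source.snippet)
            section.sources.append(source)


def build_article_prompt(sections: list[Section], tone: str, length_words: int,
                         keywords: list[str]) -> str:
    """Stage 3: combine outline, talking points, and style settings into one prompt."""
    lines = [
        f"Write a blog article of about {length_words} words in a {tone} tone.",
        f"Work these keywords in naturally: {', '.join(keywords)}.",
        "Follow this outline and use only the listed talking points as facts:",
    ]
    for number, section in enumerate(sections, start=1):
        lines.append(f"{number}. {section.heading}")
        lines.extend(f"   - {point}" for point in section.talking_points)
    return "\n".join(lines)


def generate_article(topic, llm, search, tone="friendly", length_words=800, keywords=()):
    """Run all three stages and return the article text plus the cited sections."""
    sections = build_outline(topic, llm)
    add_research(sections, search)
    prompt = build_article_prompt(sections, tone, length_words, list(keywords))
    return llm(prompt), sections
```

Returning the sections alongside the article keeps each section's sources available for display, which is what makes the citation step possible.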

Key features

1️⃣
Drag and drop outline builder: Added a mix-and-match feature for multiple outlines while preserving generated history. This design choice came from the observation that writers often revise their outlines and dislike losing content they've already created.
  • Allows combining different outline versions with drag-and-drop
  • Capitalizes on the loss aversion principle
  • Speeds up outline creation process
 
 
2️⃣
Sub-points with RAG search integration: This feature enhanced content depth and reduced hallucinations by letting users add their own dot points to each outline section, with a “Research” button that runs real-time internet searches to back those points up. I wanted to balance automated research with human creativity and encourage writers to be more hands-on and create value where it mattered, rather than just build a content mill.
  • Allows writers to add and edit points to ensure their own personal perspective
  • Integrates real-time internet searches for additional context
  • Streamlines the research process within the platform and prevents context switching
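
One way to keep the writer's own points in front while still adding research is sketched below. This is only an assumption about how such a merge could work: `merge_points` and its `max_researched` cap are hypothetical names, and the duplicate check is a simple text normalization rather than anything semantic.

```python
def merge_points(user_points: list[str], researched_points: list[str],
                 max_researched: int = 3) -> list[str]:
    """Keep every user point in order, then append up to max_researched new points."""

    def normalize(text: str) -> str:
        # Lowercase, collapse whitespace, and ignore a trailing period.
        return " ".join(text.lower().split()).rstrip(".")

    merged = [point.strip() for point in user_points if point.strip()]
    seen = {normalize(point) for point in merged}
    added = 0
    for point in researched_points:
        key = normalize(point)
        if not key or key in seen:
            continue  # skip blanks and points the writer already made
        if added == max_researched:
            break
        merged.append(point.strip())
        seen.add(key)
        added += 1
    return merged
```

Capping the researched points keeps the section weighted toward the writer's perspective instead of the search results.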
 
3️⃣
Source integration: I designed a source section with links to sources directly within each section, inspired by bibliography formats. This design choice serves two purposes: it builds authority and helps users trust the information by allowing them to verify it themselves.
  • Intentionally prominent display for user awareness
  • Builds content authority
  • Allows users to verify information easily
Displaying so much information in one section can be overwhelming. Future iterations might include options to collapse sources or make them less conspicuous, to better balance usability with transparency.
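
For the bibliography-style display, a per-section source list could be built roughly as follows. This is a sketch under assumptions: `format_sources` is a hypothetical helper, sources are taken as `(title, url)` pairs, and the `[n] Title: URL` layout is just one plausible format.

```python
def format_sources(sources: list[tuple[str, str]]) -> list[str]:
    """Return numbered bibliography lines for one section, skipping repeated URLs."""
    lines: list[str] = []
    seen_urls: set[str] = set()
    for title, url in sources:
        if url in seen_urls:
            continue  # the same page cited twice should appear once
        seen_urls.add(url)
        lines.append(f"[{len(lines) + 1}] {title}: {url}")
    return lines
```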
4️⃣
 
Style, tone, and inputs control panel: A customizable interface gives writers precise control over their content's voice and structure. This allows users to fine-tune their writing and make sure it aligns with their intended style and audience expectations.
  • Fine-grained inputs for adjusting writing style
  • Designed to accommodate various writing workflows
  • Allows writers to tailor their content for specific audiences or platforms
Since it’s a secondary feature with many options, I wanted to keep the interface clutter-free without hiding the panel. Rather than giving it more visual prominence, I highlighted it in the onboarding tutorial sequence to improve discovery.
 
 
I also shot and edited a tutorial video as part of our video onboarding series.

Takeaways

From a technical standpoint, integrating RAG was an exercise in compression rather than expansion. There was a tradeoff between the depth of results and how quickly we could surface them to users, so our data retrieval, cleaning, and processing pipeline had to be significantly optimized.
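
To make the compression idea concrete, here is a minimal sketch of trimming retrieved text to a fixed budget before it reaches the LLM. It is an illustration, not our actual pipeline: `compress_snippets`, the word-overlap scoring, and the `max_chars` budget are all assumptions chosen for simplicity.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after ., !, or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def compress_snippets(query: str, snippets: list[str], max_chars: int) -> list[str]:
    """Keep the sentences sharing the most words with the query, within max_chars."""
    query_terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    sentences = [s for text in snippets for s in split_sentences(text)]

    scored = []
    for position, sentence in enumerate(sentences):
        terms = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        overlap = len(terms & query_terms)
        if overlap:  # drop sentences with no connection to the query
            scored.append((-overlap, position, sentence))

    kept, used = [], 0
    for _, position, sentence in sorted(scored):  # best overlap first
        if used + len(sentence) > max_chars:
            continue
        kept.append((position, sentence))
        used += len(sentence)
    return [sentence for _, sentence in sorted(kept)]  # restore original order
```

A smaller budget means a shorter prompt and a faster response, at the cost of dropping lower-ranked facts, which is the depth-versus-speed tradeoff described above.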

Future improvements

  • Implement SEO-focused features, such as keyword density analysis and suggestions
  • Add multi-modal content creation and retrieval for rich media integration
  • Create collaborative features for team-based article creation and editing
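
As a pointer for the keyword density idea, density is usually taken as the share of an article's words that belong to occurrences of the keyword. The sketch below assumes that definition; `keyword_density` is a hypothetical helper and its tokenization is deliberately simple.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Fraction of words in text that belong to occurrences of keyword."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", keyword.lower())
    if not words or not target:
        return 0.0
    size = len(target)
    # Count every position where the keyword's words appear consecutively.
    hits = sum(1 for i in range(len(words) - size + 1) if words[i:i + size] == target)
    return hits * size / len(words)
```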