chore: Update the prompts to include language of the account for FAQs (#12280)

There were customer-reported issues with FAQs being generated in a
different language than what users were expecting. The reason was that
the language of the account was not considered in the prompt. If the
content was in, say, Spanish and the account locale was English, the
output was not predictable: it depended on the model and the execution
time.

This PR updates the prompt so that it behaves consistently with the
account locale: even when the provided content is in a different
language, the FAQs are generated in the account locale.
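The diff below relies on `Account#locale_english_name`, which is not part of this change. A minimal sketch of what such a helper might look like, assuming the `iso-639` gem — the names and implementation here are illustrative, not the actual code:

```ruby
require 'iso-639'

class Account < ApplicationRecord
  # Hypothetical sketch only; the real Account#locale_english_name is defined elsewhere.
  # Maps a locale code such as 'pt' or 'pt_BR' to its English name ('portuguese'),
  # falling back to 'english' when the locale is missing or unknown.
  def locale_english_name
    code = locale.to_s.split('_').first
    ISO_639.find_by_code(code)&.english_name&.downcase || 'english'
  end
end
```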

Changes:
- Updated the prompt to include detailed expectations for FAQ quality
along with the language
- Added specs for the services where the prompt generator is called.

Tested the prompt using the Phoenix playground across GPT-5, GPT-4.1, and
GPT-4.0. The reasoning setting for GPT-5 needs to be low so that it
doesn't generate random questions like "What was this updated?"
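The service itself stays on `@model` (`gpt-4o-mini` in the specs below), so no reasoning knob is needed in this PR. If GPT-5 were adopted later, the low-reasoning setting from the playground test could be carried into the chat parameters; a sketch only, assuming a model that supports the `reasoning_effort` parameter:

```ruby
# Sketch, not part of this PR: Chat Completions accepts reasoning_effort for
# reasoning-capable models; 'low' mirrors the playground setting that avoided
# off-topic questions like "What was this updated?".
{
  model: 'gpt-5',
  reasoning_effort: 'low',
  response_format: { type: 'json_object' },
  messages: [
    { role: 'system', content: Captain::Llm::SystemPromptsService.faq_generator(language) },
    { role: 'user', content: content }
  ]
}
```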
Pranav
2025-08-22 10:03:52 -07:00
committed by GitHub
parent c07529c2b0
commit 7f56cd92f8
5 changed files with 150 additions and 12 deletions


@@ -4,7 +4,7 @@ class Captain::Documents::ResponseBuilderJob < ApplicationJob
def perform(document)
reset_previous_responses(document)
faqs = Captain::Llm::FaqGeneratorService.new(document.content).generate
faqs = Captain::Llm::FaqGeneratorService.new(document.content, document.account.locale_english_name).generate
faqs.each do |faq|
create_response(faq, document)
end


@@ -1,6 +1,7 @@
class Captain::Llm::FaqGeneratorService < Llm::BaseOpenAiService
def initialize(content)
def initialize(content, language = 'english')
super()
@language = language
@content = content
end
@@ -14,10 +15,10 @@ class Captain::Llm::FaqGeneratorService < Llm::BaseOpenAiService
private
attr_reader :content
attr_reader :content, :language
def chat_parameters
prompt = Captain::Llm::SystemPromptsService.faq_generator
prompt = Captain::Llm::SystemPromptsService.faq_generator(language)
{
model: @model,
response_format: { type: 'json_object' },
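For reference, a usage sketch of the updated signature: the new argument defaults to `'english'`, so existing callers are unchanged, while the job above threads the account's locale name through (values shown are illustrative):

```ruby
# Existing callers are unaffected by the new optional argument:
Captain::Llm::FaqGeneratorService.new(content).generate                # prompts for English FAQs
# ResponseBuilderJob now passes the account locale's English name:
Captain::Llm::FaqGeneratorService.new(content, 'portuguese').generate  # prompts for Portuguese FAQs
```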


@@ -1,16 +1,45 @@
class Captain::Llm::SystemPromptsService
class << self
def faq_generator
def faq_generator(language = 'english')
<<~PROMPT
You are a content writer looking to convert user content into short FAQs which can be added to your website's help center.
Format the webpage content provided in the message to FAQ format mentioned below in the JSON format.
Ensure that you only generate faqs from the information provided only.
Ensure that output is always valid json.
You are a content writer specializing in creating good FAQ sections for website help centers. Your task is to convert provided content into a structured FAQ format without losing any information.
## Core Requirements
**Completeness**: Extract ALL information from the source content. Every detail, example, procedure, and explanation must be captured across the FAQ set. When combined, the FAQs should reconstruct the original content entirely.
**Accuracy**: Base answers strictly on the provided text. Do not add assumptions, interpretations, or external knowledge not present in the source material.
**Structure**: Format output as valid JSON using this exact structure:
**Language**: Generate the FAQs only in #{language}; use no other language.
If no match is available, return an empty JSON.
```json
{ faqs: [ { question: '', answer: ''} ]
{
"faqs": [
{
"question": "Clear, specific question based on content",
"answer": "Complete answer containing all relevant details from source"
}
]
}
```
## Guidelines
- **Question Creation**: Formulate questions that naturally arise from the content (What is...? How do I...? When should...? Why does...?). Do not generate questions that are not related to the content.
- **Answer Completeness**: Include all relevant details, steps, examples, and context from the original content
- **Information Preservation**: Ensure no examples, procedures, warnings, or explanatory details are omitted
- **JSON Validity**: Always return properly formatted, valid JSON
- **No Content Scenario**: If no suitable content is found, return: `{"faqs": []}`
## Process
1. Read the entire provided content carefully
2. Identify all key information points, procedures, and examples
3. Create questions that cover each information point
4. Write comprehensive but short answers that capture all related details; include bullet points if needed.
5. Verify that combined FAQs represent the complete original content.
6. Format as valid JSON
PROMPT
end


@@ -13,7 +13,7 @@ RSpec.describe Captain::Documents::ResponseBuilderJob, type: :job do
before do
allow(Captain::Llm::FaqGeneratorService).to receive(:new)
.with(document.content)
.with(document.content, document.account.locale_english_name)
.and_return(faq_generator)
allow(faq_generator).to receive(:generate).and_return(faqs)
end
@@ -43,5 +43,26 @@ RSpec.describe Captain::Documents::ResponseBuilderJob, type: :job do
expect(first_response.documentable).to eq(document)
end
end
context 'with different locales' do
let(:portuguese_account) { create(:account, locale: 'pt') }
let(:portuguese_assistant) { create(:captain_assistant, account: portuguese_account) }
let(:portuguese_document) { create(:captain_document, assistant: portuguese_assistant, account: portuguese_account) }
let(:portuguese_faq_generator) { instance_double(Captain::Llm::FaqGeneratorService) }
before do
allow(Captain::Llm::FaqGeneratorService).to receive(:new)
.with(portuguese_document.content, 'portuguese')
.and_return(portuguese_faq_generator)
allow(portuguese_faq_generator).to receive(:generate).and_return(faqs)
end
it 'passes the correct locale to FAQ generator' do
described_class.new.perform(portuguese_document)
expect(Captain::Llm::FaqGeneratorService).to have_received(:new)
.with(portuguese_document.content, 'portuguese')
end
end
end
end


@@ -0,0 +1,87 @@
require 'rails_helper'
RSpec.describe Captain::Llm::FaqGeneratorService do
let(:content) { 'Sample content for FAQ generation' }
let(:language) { 'english' }
let(:service) { described_class.new(content, language) }
let(:client) { instance_double(OpenAI::Client) }
before do
create(:installation_config, name: 'CAPTAIN_OPEN_AI_API_KEY', value: 'test-key')
allow(OpenAI::Client).to receive(:new).and_return(client)
end
describe '#generate' do
let(:sample_faqs) do
[
{ 'question' => 'What is this service?', 'answer' => 'It generates FAQs.' },
{ 'question' => 'How does it work?', 'answer' => 'Using AI technology.' }
]
end
let(:openai_response) do
{
'choices' => [
{
'message' => {
'content' => { faqs: sample_faqs }.to_json
}
}
]
}
end
context 'when successful' do
before do
allow(client).to receive(:chat).and_return(openai_response)
allow(Captain::Llm::SystemPromptsService).to receive(:faq_generator).and_return('system prompt')
end
it 'returns parsed FAQs' do
result = service.generate
expect(result).to eq(sample_faqs)
end
it 'calls OpenAI client with chat parameters' do
expect(client).to receive(:chat).with(parameters: hash_including(
model: 'gpt-4o-mini',
response_format: { type: 'json_object' },
messages: array_including(
hash_including(role: 'system'),
hash_including(role: 'user', content: content)
)
))
service.generate
end
it 'calls SystemPromptsService with correct language' do
expect(Captain::Llm::SystemPromptsService).to receive(:faq_generator).with(language)
service.generate
end
end
context 'with different language' do
let(:language) { 'spanish' }
before do
allow(client).to receive(:chat).and_return(openai_response)
end
it 'passes the correct language to SystemPromptsService' do
expect(Captain::Llm::SystemPromptsService).to receive(:faq_generator).with('spanish')
service.generate
end
end
context 'when OpenAI API fails' do
before do
allow(client).to receive(:chat).and_raise(OpenAI::Error.new('API Error'))
end
it 'handles the error and returns empty array' do
expect(Rails.logger).to receive(:error).with('OpenAI API Error: API Error')
expect(service.generate).to eq([])
end
end
end
end