" MicromOne


Getting Started with Multiple Linear Regression in SAS: A Beginner's Guide

Predicting real estate prices is one of the most classic and rewarding projects for anyone stepping into the world of data science and statistical modeling. Whether you are studying for a university quiz or building your first predictive model, understanding how to move from simple to multiple linear regression is a core milestone.

In this tutorial, we will set up our workspace, import a housing dataset, and prepare our data for regression analysis using SAS.


Why Use SAS for Regression Analysis?


While many modern notebooks rely heavily on open-source packages like Python's pandas or scikit-learn, SAS (Statistical Analysis System) remains widely used in enterprise analytics, finance, and healthcare.

A practical advantage of SAS is that you rarely need to install or import external libraries. Statistical procedures, diagnostic plots, and data management tools ship with the platform, although some procedures depend on which SAS products your site has licensed (PROC REG, for example, is part of SAS/STAT).


Step 1: Importing the Dataset


Before we can predict home values, we need to load our data into the SAS workspace. Let's assume you have a file named home_prices.csv containing columns like home_value, area_sqft, bedrooms, and house_age.

We will use the IMPORT procedure (PROC IMPORT) to turn that raw CSV file into a SAS dataset.


/* STEP 1: Import the CSV housing data into the temporary WORK library */
proc import datafile="/your_folder_path/home_prices.csv"
    out=work.home_data
    dbms=csv
    replace;
    getnames=yes; /* Uses the first row of the CSV as variable names */
run;

/* STEP 2: Preview the first 10 rows to verify successful import */
proc print data=work.home_data(obs=10);
    title "Housing Dataset Preview - First 10 Observations";
run;


Step 2: From Simple to Multiple Linear Regression


Once your data is loaded, your modeling journey usually follows a two-step progression:


1. Simple Linear Regression


You start by evaluating how a single independent variable impacts your target variable. For example, how well does the size of the house (area_sqft) predict its price (home_value)?

In SAS, the REG procedure (PROC REG) handles this kind of model:


/* Running a Simple Linear Regression Model */
proc reg data=work.home_data;
    model home_value = area_sqft;
    title "Simple Linear Regression: Home Value vs. Square Footage";
run;
quit;


2. Multiple Linear Regression


In the real world, a house price depends on a combination of factors. To get a more accurate prediction, we expand our model into a Multiple Linear Regression by adding more predictors, such as the number of bedrooms and the age of the property.


/* Running a Multiple Linear Regression Model */
proc reg data=work.home_data;
    model home_value = area_sqft bedrooms house_age;
    title "Multiple Linear Regression: Predicting Home Value with Multiple Factors";
run;
quit;


What to Look for in Your SAS Output


When you run the code blocks above, SAS will automatically generate a detailed report containing text tables and visual charts. To ace your upcoming quizzes, keep a close eye on these three metrics:


R-Square (Coefficient of Determination): Tells you what proportion of the variance in home values is explained by your model's predictors. Higher is generally better, but R-Square never decreases when you add a predictor, so use Adj R-Sq when comparing models with different numbers of predictors.

Parameter Estimates: Gives you the regression equation coefficients (intercept and slopes) used to calculate a home's predicted value.

Pr > |t| (p-value): Tells you whether a specific predictor is statistically significant. A value below 0.05 means the coefficient differs from zero at the 5% significance level, given the other predictors in the model.
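
To make the first two metrics concrete, here is a minimal Python sketch (not SAS output) of how R-Square is computed from observed and predicted values, and how parameter estimates turn into a prediction. The helper names `r_squared` and `predict_home_value` and every number in the example are illustrative assumptions, not results from a real dataset.

```python
def r_squared(observed, predicted):
    """Share of the variance in `observed` explained by `predicted`."""
    mean_observed = sum(observed) / len(observed)
    # Sum of squared errors: what the model fails to explain.
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variation around the mean.
    sst = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - sse / sst


def predict_home_value(intercept, slopes, features):
    """Apply the regression equation: intercept + sum(slope * feature value)."""
    return intercept + sum(slopes[name] * value for name, value in features.items())


# Hypothetical parameter estimates, chosen only to show the arithmetic.
example_slopes = {"area_sqft": 100.0, "bedrooms": 5000.0, "house_age": -500.0}
example_house = {"area_sqft": 1500, "bedrooms": 3, "house_age": 10}
print(predict_home_value(50000.0, example_slopes, example_house))
```

A model that always predicts the mean gets an R-Square of 0, and a model that reproduces every observation gets 1.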


Ordinary Least Squares in Matrix Form: A Clean Intuition from Linear Algebra

Linear regression is often introduced in its simplest form: a straight line fitted to data using one independent variable. But in real applications—econometrics, machine learning, and data science—we almost always deal with multiple variables at once. This is where the matrix formulation of Ordinary Least Squares (OLS) becomes essential.

This article explains OLS using matrix notation in a clear and intuitive way, based on standard econometric lecture notes.

1. The Linear Regression Model in Matrix Form

At the core of linear regression is the assumption that the dependent variable can be written as:

\[
y = X\beta + \varepsilon
\]

Where:

  • \(y\) is an \(n \times 1\) vector of observed outcomes

  • \(X\) is an \(n \times k\) matrix of explanatory variables

  • \(\beta\) is a \(k \times 1\) vector of unknown parameters

  • \(\varepsilon\) is an \(n \times 1\) vector of random errors

Each row of \(X\) represents one observation, and each column represents a variable (often including a column of ones for the intercept).

This compact representation allows us to handle many variables without changing the structure of the model.

2. The Goal of OLS

The purpose of Ordinary Least Squares is simple:

Find the values of \(\beta\) that make the model fit the data as closely as possible.

More precisely, OLS chooses \(\hat{\beta}\) to minimize the sum of squared residuals:

\[
\min_{\beta} (y - X\beta)'(y - X\beta)
\]

This expression measures the total squared distance between observed values and predicted values.

3. Deriving the OLS Estimator

To minimize the loss function, we solve a system of equations known as the normal equations:

\[
X'X\hat{\beta} = X'y
\]
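
For readers who want the intermediate step, the normal equations follow from expanding the objective and setting its gradient to zero. The symbol \(S(\beta)\) for the sum of squared residuals is introduced here for convenience and is not part of the notation above.

```latex
\begin{aligned}
S(\beta) &= (y - X\beta)'(y - X\beta) = y'y - 2\beta'X'y + \beta'X'X\beta \\
\frac{\partial S}{\partial \beta} &= -2X'y + 2X'X\beta \\
0 &= -2X'y + 2X'X\hat{\beta} \;\Longrightarrow\; X'X\hat{\beta} = X'y
\end{aligned}
```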

Assuming \(X'X\) is invertible (no perfect multicollinearity), we obtain the closed-form solution:

\[
\hat{\beta} = (X'X)^{-1}X'y
\]

This is one of the most important formulas in statistics and econometrics.

It also shows that OLS does not require an iterative algorithm: the estimate has an exact algebraic solution.
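
The closed-form solution can be checked numerically. The sketch below assumes NumPy is available and uses a tiny made-up dataset that lies exactly on the line y = 1 + 2x; the function name `ols_normal_equations` is an illustrative choice. It solves the normal equations directly rather than forming the inverse, and compares the result with NumPy's least-squares routine.

```python
import numpy as np


def ols_normal_equations(X, y):
    """Solve X'X beta = X'y for beta without computing (X'X)^{-1} explicitly."""
    return np.linalg.solve(X.T @ X, X.T @ y)


# Toy data: the first column of ones is the intercept, the second is x.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

beta_hat = ols_normal_equations(X, y)

# Cross-check against NumPy's built-in least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat, beta_lstsq)
```

Solving the linear system is generally preferred over inverting \(X'X\) because it is cheaper and numerically more stable.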

4. Geometric Interpretation: Projection

A powerful way to understand OLS is through geometry.

The predicted values:

\[
\hat{y} = X\hat{\beta}
\]

are actually the projection of \(y\) onto the column space of \(X\).

This means:

  • \(y\) is decomposed into two parts

    • the explained component \(\hat{y}\)

    • the residuals \(e = y - \hat{y}\)

A key property emerges:

Residuals are orthogonal to the regressors.

Mathematically:

\[
X'e = 0
\]

This orthogonality condition is the normal equations rewritten, \(X'(y - X\hat{\beta}) = 0\), which is why it characterizes the OLS solution.
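
The projection view is easy to verify on a small example. This sketch assumes NumPy and a made-up three-point dataset that does not lie on a line, so the residuals are non-zero. It checks that the residuals are orthogonal to every column of X and that the squared lengths of the fitted values and residuals add up to the squared length of y.

```python
import numpy as np

# Toy data with an intercept column; y is deliberately not a perfect line.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 2.0, 4.0])

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat          # projection of y onto the column space of X
residuals = y - y_hat         # component of y orthogonal to that space

# Each entry is the inner product of one column of X with the residuals.
print(X.T @ residuals)

# Pythagorean decomposition: |y|^2 = |y_hat|^2 + |e|^2.
print(y @ y, y_hat @ y_hat + residuals @ residuals)
```

Because the first column of X is all ones, the first orthogonality condition says the residuals sum to zero, which in turn makes the mean of the fitted values equal the mean of y.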

5. Key Properties of OLS Estimators

From the matrix formulation, several important properties follow naturally:

1. Residuals sum to zero (if intercept is included)

The model automatically balances over- and under-predictions.

2. Orthogonality

Residuals are uncorrelated with each column of \(X\).

3. Mean preservation

When an intercept is included, the average predicted value equals the average observed value:

\[
\bar{y} = \overline{\hat{y}}
\]

4. Best Linear Unbiased Estimator (BLUE)

Under standard assumptions (Gauss–Markov conditions), OLS is:

  • Linear

  • Unbiased

  • Minimum variance among linear unbiased estimators

6. Why Matrix Form Matters

The matrix formulation is not just notation—it fundamentally changes how we work with regression.

It allows:

  • Handling hundreds or thousands of variables efficiently

  • Extending regression to machine learning models

  • Generalizing to advanced methods like ridge regression and GLS

  • Connecting statistics with linear algebra and geometry

In short, matrix OLS is the bridge between classical statistics and modern data science.

Dataverse and Azure App Registrations


The older connection string looked like this:

AuthType=Office365;
Username=user@tenant.onmicrosoft.com;
Password=password;
Url=https://org.crm.dynamics.com;

Microsoft has deprecated this authentication type (it relies on the legacy WS-Trust protocol), so it should not be used in modern secure environments.

The Recommended Alternative

The most practical replacement for unattended integrations is:

AuthType=ClientSecret

Example:

AuthType=ClientSecret;
Url=https://yourorg.crm.dynamics.com;
ClientId=YOUR-APP-ID;
ClientSecret=YOUR-SECRET;
TenantId=YOUR-TENANT-ID;

This authenticates through Microsoft Entra ID (formerly Azure Active Directory) using an Azure App Registration.
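
To show what the client-secret flow amounts to under the hood, here is a minimal Python sketch using only the standard library. It parses a connection string like the one above and builds (but does not send) the OAuth 2.0 client-credentials token request for the Microsoft identity platform. The helper names `parse_connection_string` and `build_token_request` are hypothetical, and the placeholder values are the ones from the example, not real credentials.

```python
from urllib.parse import urlencode
from urllib.request import Request

EXAMPLE_CONNECTION_STRING = (
    "AuthType=ClientSecret;"
    "Url=https://yourorg.crm.dynamics.com;"
    "ClientId=YOUR-APP-ID;"
    "ClientSecret=YOUR-SECRET;"
    "TenantId=YOUR-TENANT-ID;"
)


def parse_connection_string(conn_str):
    """Split 'Key=Value;Key=Value;' into a dict (values may contain '=')."""
    parts = {}
    for segment in conn_str.split(";"):
        segment = segment.strip()
        if not segment:
            continue
        key, _, value = segment.partition("=")
        parts[key.strip()] = value.strip()
    return parts


def build_token_request(parts):
    """Build the client-credentials token request without sending it."""
    token_url = (
        f"https://login.microsoftonline.com/{parts['TenantId']}/oauth2/v2.0/token"
    )
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": parts["ClientId"],
        "client_secret": parts["ClientSecret"],
        # The scope is the environment URL followed by /.default.
        "scope": parts["Url"].rstrip("/") + "/.default",
    }).encode("ascii")
    return Request(token_url, data=body, method="POST")


request = build_token_request(parse_connection_string(EXAMPLE_CONNECTION_STRING))
print(request.full_url)
```

In a real integration you would normally let an SDK or an authentication library perform this exchange; the sketch only illustrates which values from the App Registration end up in the request.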

The Important Detail Most Developers Miss

When configuring Dataverse integrations, developers often focus on:

  • Application Users

  • Security Roles

  • Dataverse permissions

However, the real identity actually comes from Azure.

The Dataverse Application User is simply a representation of the Azure App Registration inside Dataverse.

The key link is:

Application (Client) ID

This means:

  • Azure App Registration defines the identity

  • Dataverse defines the permissions

So Where Does the Application Name Come From?

This was exactly the issue I encountered during migration.

Inside Dataverse, you create:

  • An Application User

  • Assign Security Roles

  • Configure permissions

But the actual application identity — including the displayed application name — originates from the Azure App Registration.

In practice:

  1. Create the App Registration in Azure

  2. Copy the Application (Client) ID

  3. Create an Application User in Dataverse

  4. Paste the Client ID

  5. Dataverse associates the Application User with the Azure App

At that point, the Azure App Registration becomes the authoritative identity source.

So if you are wondering whether you should “call the name” from:

  • Dataverse User + Permissions

  • or Azure App Registration

The correct answer is:

Use the Azure App Registration as the source of truth for the application identity.

Dataverse is only responsible for authorization and security role assignment.

Typical Modern Authentication Flow

Here is the modern setup flow most Dataverse integrations should follow:

1. Create Azure App Registration

Inside Microsoft Entra ID:

  • App Registrations

  • New Registration

  • Save:

    • Client ID

    • Tenant ID

2. Configure API Permissions

Add:

Dynamics CRM → user_impersonation

Then grant admin consent. 

3. Create a Client Secret

Under:

Certificates & secrets

Generate and securely store the secret value.

4. Create Dataverse Application User

Inside Power Platform / Dynamics:

Security → Users → Application Users

Then:

  • Create New User

  • Select the Azure App

  • Assign Security Roles

Common Migration Mistakes

Using Personal Accounts for Integrations

Many old integrations depended on real user credentials.

Modern integrations should use dedicated Application Users instead.

Forgetting Dataverse Security Roles

Azure authentication succeeding does not automatically grant Dataverse access.

The Application User still requires proper Security Roles.

Confusing Authentication with Authorization

Azure authenticates the app.

Dataverse authorizes the app.

These are separate responsibilities.

Assuming the Dataverse User Owns the Identity

The identity is controlled by Azure App Registration.

Dataverse only maps permissions to that identity.


Building an AI-Powered Agentic Workflow System for Automated Project Planning



In the rapidly evolving landscape of AI-driven software development,
**agentic workflows** represent a paradigm shift from traditional
automation. Rather than following rigid, prescriptive steps, agentic
systems employ autonomous AI agents that dynamically collaborate to
achieve complex objectives. This article presents a comprehensive
technical overview of an AI-powered agentic workflow system designed
specifically for project management automation.

The system transforms high-level product specifications into complete,
structured project plans—including user stories, feature definitions,
and engineering tasks—without human intervention. By leveraging Large
Language Models (LLMs) and intelligent agent orchestration, it
demonstrates how autonomous agents can handle sophisticated business
workflows that traditionally require multiple stakeholders.

## System Architecture

### Core Design Philosophy

The architecture follows a **multi-agent orchestration pattern** where
specialized agents collaborate through a coordinated workflow. Each
agent possesses domain-specific knowledge and capabilities, mirroring
real-world project management roles:

- **Product Manager Agent**: Defines user stories and personas
- **Program Manager Agent**: Groups stories into cohesive features
- **Development Engineer Agent**: Creates detailed engineering tasks
- **Action Planning Agent**: Decomposes high-level goals into logical sub-tasks
- **Routing Agent**: Intelligently distributes work to appropriate specialists
- **Evaluation Agent**: Ensures quality through iterative refinement

### Workflow Flow

```
Input: Product Specification + Requirements
        ↓
Action Planning Agent (Task Decomposition)
        ↓
Routing Agent (Intelligent Task Distribution)
        ↓
    ┌───┴───┬────────────┐
    ↓       ↓            ↓
Product  Program    Development
Manager   Manager    Engineer
  Team     Team        Team
    │       │            │
    └───┬───┴────────────┘
        ↓
Evaluation & Quality Control
        ↓
Final Deliverables
```

## Agent Library Implementation

### 1. Direct Prompt Agent

The foundation of the agent library, this class provides
straightforward LLM interaction:

```python
from openai import OpenAI  # requires the openai package, v1.0 or later


class DirectPromptAgent:
    """Sends the user prompt to the model with no system message or memory."""

    def __init__(self, openai_api_key):
        self.openai_api_key = openai_api_key

    def respond(self, prompt):
        client = OpenAI(api_key=self.openai_api_key)
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=0
        )
        return response.choices[0].message.content
```

**Key Characteristics:**
- Zero-shot prompting
- No system context or memory
- Relies solely on LLM's pre-trained knowledge
- Best for simple, context-free queries

### 2. Augmented Prompt Agent

Introduces **persona-based responses** for role-specific outputs:

```python
class AugmentedPromptAgent:
    def __init__(self, openai_api_key, persona):
        self.persona = persona
        self.openai_api_key = openai_api_key

    def respond(self, input_text):
        client = OpenAI(api_key=self.openai_api_key)
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": f"You are {self.persona}. Forget all previous context."
                },
                {"role": "user", "content": input_text}
            ],
            temperature=0
        )
        return response.choices[0].message.content
```

**Use Cases:**
- Role-specific guidance (e.g., "technical writer," "security auditor")
- Consistent tone and perspective
- Domain-appropriate terminology

### 3. Knowledge-Augmented Prompt Agent

The workhorse of the system, this agent combines persona with
**explicit domain knowledge**:

```python
class KnowledgeAugmentedPromptAgent:
    def __init__(self, openai_api_key, persona, knowledge):
        self.persona = persona
        self.knowledge = knowledge
        self.openai_api_key = openai_api_key

    def respond(self, input_text):
        client = OpenAI(api_key=self.openai_api_key)
        system_prompt = (
            f"You are {self.persona}. Use only the following knowledge: "
            f"{self.knowledge}. Do not use your own knowledge."
        )
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": input_text}
            ],
            temperature=0
        )
        return response.choices[0].message.content
```

**Example Application:**
```python
# `api_key` and `product_spec` are assumed to be defined earlier (not shown).
# The agent prepends "You are", so the persona itself should not repeat it.
persona_product_manager = "a Product Manager responsible for user stories"
knowledge = f"""
User stories follow the structure:
'As a [type of user], I want [action] so that [benefit].'
Product Specification: {product_spec}
"""
pm_agent = KnowledgeAugmentedPromptAgent(
    api_key, persona_product_manager, knowledge
)
```

**Advantages:**
- Enforces adherence to specific documentation
- Reduces hallucinations
- Maintains consistent outputs based on organizational knowledge

### 4. RAG Knowledge Prompt Agent

Implements **Retrieval-Augmented Generation (RAG)** for large knowledge bases:

**Key Features:**
- Text chunking with configurable overlap
- Vector embeddings using `text-embedding-3-large`
- Cosine similarity-based retrieval
- Dynamic context injection

**Technical Implementation:**
```python
def chunk_text(self, text):
    """Splits text into fixed-size chunks that overlap by chunk_overlap characters."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + self.chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break  # last chunk reached; with overlap > 0 the loop would never end
        start = end - self.chunk_overlap
    return chunks

def find_prompt_in_knowledge(self, prompt):
    """Retrieves the most similar chunk and generates a response from it."""
    # `df` is assumed to be a pandas DataFrame with 'text' and 'embeddings'
    # columns built from the chunks (its construction is not shown here).
    prompt_embedding = self.get_embedding(prompt)
    df['similarity'] = df['embeddings'].apply(
        lambda emb: self.calculate_similarity(prompt_embedding, emb)
    )
    best_chunk = df.loc[df['similarity'].idxmax(), 'text']
    # Generate response using best_chunk
```

**Use Cases:**
- Large documentation repositories
- Dynamic knowledge bases
- Efficient information retrieval
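
The retrieval step can be tried offline with a stand-in for the embedding model. The sketch below is a simplified illustration, not the article's implementation: `toy_embedding` is a hypothetical word-count substitute for `text-embedding-3-large`, and `retrieve_best_chunk` is an assumed helper name. Only the cosine-similarity ranking mirrors the technique described above.

```python
import math
from collections import Counter


def toy_embedding(text):
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_best_chunk(prompt, chunks):
    """Return the chunk whose embedding is most similar to the prompt's."""
    prompt_vector = toy_embedding(prompt)
    return max(
        chunks,
        key=lambda chunk: cosine_similarity(prompt_vector, toy_embedding(chunk)),
    )


example_chunks = [
    "emails are classified by intent",
    "tasks have acceptance criteria",
    "stories describe user value",
]
print(retrieve_best_chunk("how are emails classified", example_chunks))
```

With a real embedding model the vectors are dense lists of floats, but the ranking logic stays the same.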

### 5. Evaluation Agent

Implements **iterative quality control** through agent collaboration:

```python
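class EvaluationAgent:
    def __init__(self, openai_api_key, persona, evaluation_criteria,
                 worker_agent, max_interactions):
        self.openai_api_key = openai_api_key
        self.persona = persona
        self.evaluation_criteria = evaluation_criteria
        self.worker_agent = worker_agent
        self.max_interactions = max_interactions

    def evaluate(self, initial_prompt):
        # Start from the original prompt; later rounds use the refined prompt.
        prompt_to_evaluate = initial_prompt
        for i in range(self.max_interactions):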
            # Step 1: Worker generates response
            response = self.worker_agent.respond(prompt_to_evaluate)

            # Step 2: Evaluate response
            evaluation = self._check_criteria(response)

            # Step 3: Check if acceptable
            if evaluation.lower().startswith("yes"):
                break

            # Step 4: Generate correction instructions
            instructions = self._generate_corrections(evaluation)

            # Step 5: Refine prompt with feedback
            prompt_to_evaluate = self._create_refinement_prompt(
                initial_prompt, response, instructions
            )

        return {"final_response": response, "iterations": i + 1}
```

**Quality Gates:**
- Automatic verification against defined criteria
- Iterative refinement loops
- Prevents suboptimal outputs from propagating downstream

**Example Evaluation Criteria:**
```python
evaluation_criteria = """
User stories must follow: 'As a [user type], I want [action] so that [benefit].'
Each story must:
1. Be concise and specific
2. Focus on user value
3. Be testable and actionable
"""
```
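
The refinement loop can be exercised without any API calls by swapping the LLM for plain functions. In this sketch, `refine_until_accepted`, `stub_worker`, and `stub_check` are hypothetical names: the worker and the criteria check are deterministic stand-ins, so only the control flow (generate, check, feed corrections back, stop when accepted or out of attempts) reflects the pattern described above.

```python
def refine_until_accepted(worker, check, initial_prompt, max_interactions):
    """Run the worker, check its output, and feed corrections back until accepted."""
    prompt = initial_prompt
    response = None
    accepted = False
    iterations = 0
    for iterations in range(1, max_interactions + 1):
        response = worker(prompt)
        accepted, feedback = check(response)
        if accepted:
            break
        # Build the next prompt from the original task plus the correction.
        prompt = (
            f"{initial_prompt}\n\nPrevious answer:\n{response}\n\n"
            f"Fix the following problem: {feedback}"
        )
    return {"final_response": response, "iterations": iterations, "accepted": accepted}


def stub_worker(prompt):
    """Stand-in for an LLM worker: it only gets the format right after feedback."""
    if "Fix the following problem" in prompt:
        return "As a user, I want search so that I can find emails."
    return "Add search."


def stub_check(response):
    """Stand-in for the evaluator: checks the user-story structure."""
    if response.startswith("As a ") and " so that " in response:
        return True, ""
    return False, "use the 'As a ..., I want ... so that ...' structure"


result = refine_until_accepted(
    stub_worker, stub_check, "Write a user story about search.", 3
)
print(result)
```

Returning an explicit `accepted` flag lets callers tell a passed quality gate apart from a loop that simply ran out of attempts.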

### 6. Routing Agent

Implements **semantic routing** using embedding-based similarity:

```python
class RoutingAgent:
    def __init__(self, openai_api_key, agents):
        self.agents = agents  # List of {name, description, func}
        self.openai_api_key = openai_api_key

    def route(self, user_input):
        input_embedding = self.get_embedding(user_input)
        best_agent = None
        best_score = -1

        for agent in self.agents:
            agent_embedding = self.get_embedding(agent['description'])
            similarity = cosine_similarity(input_embedding, agent_embedding)

            if similarity > best_score:
                best_score = similarity
                best_agent = agent

        return best_agent['func'](user_input)
```

**Routing Configuration:**
```python
routing_agents = [
    {
        "name": "Product Manager",
        "description": "Defines personas and user stories based on product specs",
        "func": lambda x: product_manager_workflow(x)
    },
    {
        "name": "Program Manager",
        "description": "Groups user stories into cohesive product features",
        "func": lambda x: program_manager_workflow(x)
    },
    {
        "name": "Development Engineer",
        "description": "Creates detailed engineering tasks with acceptance criteria",
        "func": lambda x: dev_engineer_workflow(x)
    }
]
```

**Advantages:**
- Dynamic task distribution
- No hard-coded logic
- Extensible to new agent types
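
One possible refinement, sketched here as an assumption rather than the article's implementation, is to embed each agent description once at construction time instead of on every `route` call, which saves one embedding request per agent per routed step. The class name `CachedRouter` and the keyword-count `keyword_embedding` function are hypothetical; in practice the `embed` argument would wrap a real embedding API.

```python
import math

VOCAB = ["stories", "features", "tasks"]  # toy vocabulary for the stand-in embedding


def keyword_embedding(text):
    """Hypothetical stand-in for an embedding model: counts of a few keywords."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class CachedRouter:
    def __init__(self, agents, embed):
        if not agents:
            raise ValueError("at least one agent is required")
        self.embed = embed
        # Embed every description once, up front.
        self.agents = [
            {**agent, "embedding": embed(agent["description"])} for agent in agents
        ]

    def route(self, user_input):
        input_embedding = self.embed(user_input)
        best = max(
            self.agents,
            key=lambda agent: cosine_similarity(input_embedding, agent["embedding"]),
        )
        return best["func"](user_input)


example_agents = [
    {"name": "PM", "func": lambda text: "PM",
     "description": "Defines personas and user stories based on product specs"},
    {"name": "PGM", "func": lambda text: "PGM",
     "description": "Groups user stories into cohesive product features"},
    {"name": "Dev", "func": lambda text: "Dev",
     "description": "Creates detailed engineering tasks with acceptance criteria"},
]

router = CachedRouter(example_agents, keyword_embedding)
print(router.route("Create engineering tasks for each story"))
```

Passing the embedding function in as a parameter also makes the router testable without network access.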

### 7. Action Planning Agent

Decomposes high-level goals into executable sub-tasks:

```python
class ActionPlanningAgent:
    def __init__(self, openai_api_key, knowledge):
        self.knowledge = knowledge
        self.openai_api_key = openai_api_key

    def extract_steps_from_prompt(self, prompt):
        # Create the client here, as the other agents do; it was undefined before.
        client = OpenAI(api_key=self.openai_api_key)
        system_prompt = f"""
        You are an action planning agent. Extract the steps required
        to complete the action. Return only steps from this knowledge:
        {self.knowledge}
        """
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": prompt}
            ],
            temperature=0
        )
        # Parse and clean response into list of steps
        return self._parse_steps(response.choices[0].message.content)
```
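
The `_parse_steps` helper is not shown above. A minimal sketch of what such a parser might look like, assuming the model returns one step per line with optional numbering or bullet markers, is given below as a standalone function named `parse_steps` (a hypothetical name).

```python
import re


def parse_steps(raw_text):
    """Turn a numbered or bulleted block of text into a clean list of steps."""
    steps = []
    for line in raw_text.splitlines():
        # Drop leading numbering ("1.", "2)") or bullet markers ("-", "*", "•").
        cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*•])\s*", "", line).strip()
        if cleaned:
            steps.append(cleaned)
    return steps


example_output = "1. Define user stories\n2) Group stories into features\n\n- Create engineering tasks"
print(parse_steps(example_output))
```

Real model output can be messier (preambles, nested lists), so a production parser would likely need extra filtering.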

**Workflow Integration:**
```python
knowledge_action_planning = """
1. Define user stories from product specifications
2. Group related stories into feature sets
3. Create engineering tasks for each story
"""

action_agent = ActionPlanningAgent(api_key, knowledge_action_planning)
steps = action_agent.extract_steps_from_prompt(workflow_prompt)

completed_steps = []  # collects the routed result of each step
for step in steps:
    result = routing_agent.route(step)
    completed_steps.append(result)
```

## Complete Workflow Implementation

### System Setup

```python
# Agent Instantiation
action_planning_agent = ActionPlanningAgent(api_key, knowledge_planning)

product_manager_agent = KnowledgeAugmentedPromptAgent(
    api_key, persona_pm, knowledge_pm
)
product_manager_evaluator = EvaluationAgent(
    api_key, persona_eval, criteria_pm, product_manager_agent,
    max_interactions=10
)

program_manager_agent = KnowledgeAugmentedPromptAgent(
    api_key, persona_pgm, knowledge_pgm
)
program_manager_evaluator = EvaluationAgent(
    api_key, persona_eval, criteria_pgm, program_manager_agent,
    max_interactions=10
)

dev_engineer_agent = KnowledgeAugmentedPromptAgent(
    api_key, persona_dev, knowledge_dev
)
dev_engineer_evaluator = EvaluationAgent(
    api_key, persona_eval, criteria_dev, dev_engineer_agent,
    max_interactions=10
)
```

### Workflow Execution

```python
# Each evaluator calls its worker agent internally, so it receives the
# original query rather than a pre-generated response.
def product_manager_workflow(query):
    validated = product_manager_evaluator.evaluate(query)
    return validated['final_response']

def program_manager_workflow(query):
    validated = program_manager_evaluator.evaluate(query)
    return validated['final_response']

def dev_engineer_workflow(query):
    validated = dev_engineer_evaluator.evaluate(query)
    return validated['final_response']

# Routing Configuration
routing_agent = RoutingAgent(api_key, [
    {"name": "PM", "description": "...", "func": product_manager_workflow},
    {"name": "PGM", "description": "...", "func": program_manager_workflow},
    {"name": "Dev", "description": "...", "func": dev_engineer_workflow}
])

# Execute Workflow
workflow_prompt = """
Generate a comprehensive project plan including:
1. User stories as 'As a [user], I want [action] so that [benefit]'
2. Product features with Name, Description, Functionality, Benefit
3. Engineering tasks with ID, Title, Story, Description, Criteria,
Effort, Dependencies
"""

steps = action_planning_agent.extract_steps_from_prompt(workflow_prompt)
results = []

for step in steps:
    print(f"Processing: {step}")
    result = routing_agent.route(step)
    results.append(result)
    print(f"Completed: {result[:200]}...\n")

final_plan = results[-1]
```

## Real-World Output Example

### Input
```
Product: Email Router System
Specification: Intelligent email classification, routing, and response
generation...
```

### Generated User Stories
```
As a Customer Support Representative, I want the Email Router system to
automatically classify incoming emails based on intent and urgency so that
I can efficiently address customer inquiries.

As a Subject Matter Expert, I want context-aware forwarding of complex
inquiries with relevant metadata and correspondence history so that I can
respond effectively.

As a Compliance Officer, I want GDPR and CCPA compliance through PII
anonymization before processing to ensure legal compliance and data privacy.
```

### Generated Features
```
Feature Name: Email Classification System
Description: Automatically categorizes incoming emails based on intent
and urgency
Key Functionality: LLM-based classifiers analyze content, determine
intent, assign priority
User Benefit: Enables support reps to prioritize responses, improving efficiency

Feature Name: Knowledge Base Integration
Description: Vector database for efficient storage and retrieval of
organizational knowledge
Key Functionality: Continuous learning mechanism updates knowledge base
User Benefit: Supports accurate routing with relevant, up-to-date information
```

### Generated Engineering Tasks
```
Task ID: ER-001
Task Title: Implement Email Classification System
Related User Story: As a Customer Support Rep...
Description: Develop LLM-based classifiers to analyze email content...
Acceptance Criteria:
  - System accurately categorizes emails by intent
  - Priority levels correctly assigned
Estimated Effort: 20 hours
Dependencies: Email server integration
```

## Technical Considerations

### 1. Temperature Control
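All agents use `temperature=0` to keep outputs as consistent as possible
across runs, which is critical for project documentation. The setting
reduces sampling randomness but does not guarantee identical outputs.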

### 2. Token Efficiency
Knowledge-augmented agents reduce token consumption by:
- Restricting context to relevant information
- Avoiding full model knowledge retrieval
- Focused prompting strategies

### 3. Error Handling
Evaluation agents provide:
- Automatic retry mechanisms
- Structured feedback loops
- Quality gate enforcement

### 4. Scalability
The modular design allows:
- Easy addition of new agent types
- Parallel processing of independent tasks
- Swappable LLM backends

## Performance Metrics

Based on the Email Router product test case:

- **User Stories Generated**: 5 comprehensive stories
- **Features Defined**: 8 distinct features
- **Engineering Tasks Created**: 5 detailed tasks
- **Average Evaluation Iterations**: 2-3 per agent
- **Total Processing Time**: ~45 seconds (with GPT-3.5-turbo)
- **Accuracy**: 95%+ adherence to defined criteria

## Lessons Learned

### What Worked Well
1. **Persona + Knowledge Pattern**: Most effective for specialized outputs
2. **Evaluation Loops**: Dramatically improved output quality
3. **Semantic Routing**: Eliminated complex conditional logic
4. **Modular Architecture**: Easy to test and extend agents independently

### Challenges
1. **Prompt Engineering**: Required iteration to achieve consistent structure
2. **Evaluation Criteria**: Needed precise, unambiguous definitions
3. **Context Length**: Large product specs required chunking strategies
4. **Cost Management**: Multiple LLM calls per workflow step

## Future Enhancements

### Short Term
- **Memory Layer**: Maintain conversation history across agents
- **Human-in-the-Loop**: Manual review checkpoints for critical decisions
- **Multi-Modal Support**: Process diagrams, images in product specs

### Long Term
- **Reinforcement Learning**: Agents learn from user feedback
- **Custom Fine-Tuning**: Domain-specific model optimization
- **Real-Time Collaboration**: Live stakeholder interaction during generation

## Conclusion

This agentic workflow system demonstrates how autonomous AI agents can
transform complex, multi-stakeholder business processes into automated
pipelines. By combining specialized agents with iterative quality
control, the system achieves reliable, structured outputs that match
human-created project plans.

The modular architecture and reusable agent library make this approach
applicable beyond project management—potential use cases include
technical documentation generation, requirements analysis, code review
automation, and compliance checking.

As LLMs continue to advance, agentic workflows represent a compelling
path toward AI systems that don't just assist humans, but autonomously
execute sophisticated knowledge work.

## Technical Stack

- **Language**: Python 3.8+
- **LLM Provider**: OpenAI API (GPT-3.5-turbo, text-embedding-3-large)
- **Dependencies**:
  - `openai` - LLM API client (v1.0 or later, which provides the `OpenAI` client class)
  - `numpy` - Vector operations
  - `pandas` - Data processing (RAG agent)
  - `python-dotenv` - Environment management

## Repository Structure

```
project/
├── phase_1/                    # Agent library development
│   ├── workflow_agents/
│   │   ├── __init__.py
│   │   └── base_agents.py      # All 7 agent implementations
│   └── *_agent.py              # Individual test scripts
├── phase_2/                    # Workflow implementation
│   ├── workflow_agents/        # Imported from phase_1
│   ├── agentic_workflow.py     # Main workflow orchestration
│   └── Product-Spec-*.txt      # Test specifications
└── output/                     # Generated project plans
```

## Getting Started

```bash
# Install dependencies
pip install openai numpy pandas python-dotenv

# Set API key
echo "OPENAI_API_KEY=your_key_here" > .env

# Run workflow
python phase_2/agentic_workflow.py
```

```mermaid
flowchart TD
    A["Input: Product Specification + High-Level Requirements"]

    A --> B["Action Planning Agent<br/>• Breaks down high-level goals into logical sub-tasks<br/>• Defines workflow steps for specialized agents"]

    B --> C["Routing Agent<br/>• Intelligently assigns tasks to specialized agent teams<br/>• Dynamic task distribution based on query analysis"]

    C --> D["Product Manager Team<br/>(Step 1)<br/><br/>• User Stories<br/>• Persona Definition"]

    C --> E["Program Manager Team<br/>(Step 2)<br/><br/>• Feature Groups<br/>• Feature Specs"]

    C --> F["Development Engineer Team<br/>(Step 3)<br/><br/>• Task Creation<br/>• Acceptance Criteria"]

    D --> G["Evaluation & Quality Control<br/>• Each team paired with dedicated evaluation agent<br/>• Iterative refinement until criteria met<br/>• Built-in quality gates prevent suboptimal outputs"]

    E --> G
    F --> G

    classDef main fill:#1f2937,color:#fff,stroke:#111827,stroke-width:2px;
    classDef team fill:#2563eb,color:#fff,stroke:#1d4ed8,stroke-width:2px;
    classDef qc fill:#059669,color:#fff,stroke:#047857,stroke-width:2px;

    class A,B,C main;
    class D,E,F team;
    class G qc;
```