How AI Is Revolutionizing Agile Program Management with Confluence & Streamlit
With Copilot integrated into the organization’s applications, surfacing rarely used data from files, SharePoint, and other accessible sources has become incredibly easy, and I’ve been relying heavily on this Gen AI capability. One day, I needed a summary view of all the features (a team’s deliverables for a quarter in an Agile framework) the team was working on, along with their statuses. Copilot, however, declined to read data from the Confluence page, which is expected and appropriate given access controls. Yet many organization, project, and program updates live on Confluence pages, and getting an overview of team goals, deliverables, project risks, and status can be time-consuming for a leader handling multiple Programs. I thought: why not build an intelligent assistant to fetch and summarize that information? It would be an effective enabler for the Program Manager and senior leadership.
The introduction of Agentic AI was a savior for me, and I decided to build the solution around it. There were still open questions: which framework to use, whether a suitable open-source option was available, and how costly a managed platform would be. After some research, I settled on open source and the following tech stack to build a lightweight AI assistant: Playwright and BeautifulSoup for scraping, Semantic Kernel with Azure OpenAI (GPT-4o) for the agent, and Streamlit for the UI.
Step 1. Confluence Page Lookup.
Instead of manually pasting URLs, each team name is mapped to its Confluence URL using a dictionary. When a user selects “Team A” from the right pane, the backend automatically fetches the associated URL and triggers scraping.
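As a rough sketch (the team names and URLs below are placeholders; the real mapping is the TEAM_URL_MAPPING dictionary in the full code), the lookup is just a dictionary get:

```python
# Hypothetical team-to-URL mapping; real team names and Confluence URLs are project-specific
TEAM_URL_MAPPING = {
    "Team A": "https://confluence.example.com/display/TEAMA/Quarterly+Plan",
    "Team B": "https://confluence.example.com/display/TEAMB/Quarterly+Plan",
}

def resolve_team_url(team_name: str):
    """Return the Confluence URL mapped to the selected team, or None if the team is unknown."""
    return TEAM_URL_MAPPING.get(team_name)

print(resolve_team_url("Team A"))  # -> Team A's Confluence URL
```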
Step 2. Web Scraping via Playwright
This step took some trial and error. I ended up using Playwright for headless, browser-based scraping, which loads dynamically rendered content and reuses an authenticated login session:
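Here is a condensed sketch of that step, assuming a previously saved state.json holding the Confluence session cookies (the full plugin version in Step 5 adds the team lookup, selector waits, and truncation):

```python
import asyncio

from bs4 import BeautifulSoup
from playwright.async_api import async_playwright

async def scrape_confluence(url: str) -> str:
    """Render a Confluence page in a headless browser and return its visible text."""
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        # Reuse a saved login session so the page renders as an authenticated user
        context = await browser.new_context(storage_state="state.json")
        page = await context.new_page()
        await page.goto(url)
        await page.wait_for_load_state("networkidle")  # let dynamic widgets finish loading
        html = await page.content()
        await browser.close()
    soup = BeautifulSoup(html, "html.parser")
    body = soup.find("div", class_="wiki-content") or soup.body
    return body.get_text(separator="\n", strip=True)

# Example (placeholder URL):
# print(asyncio.run(scrape_confluence("https://confluence.example.com/display/TEAMA/Quarterly+Plan")))
```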
Step 3. Define Client, Agent, and Prompt Engineering with Semantic Kernel.
The tool is meant to be a Program Management enabler. The agent instructions are drafted so that the scraped content is used to answer PM-style questions, with output returned either as a textual summary or as structured data suitable for a chart. This is also an example of a low-code approach.
In addition, I defined the Azure OpenAI client and registered the Confluence functions with the agent as a plugin.
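Roughly, the wiring looks like the sketch below; the endpoint, key, API version, and the stubbed plugin body are placeholders, and the real plugin, instructions, and client appear in the full code in Step 5:

```python
from typing import Annotated

from openai import AsyncAzureOpenAI
from semantic_kernel.agents import ChatCompletionAgent
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
from semantic_kernel.functions import kernel_function

class ConfluencePlugin:
    """Exposes the scraping step to the agent as a callable tool."""
    @kernel_function(description="Scrape and return text content from a Confluence page.")
    async def get_confluence_page_content(
        self, team_name: Annotated[str, "Name of the Agile team"]
    ) -> Annotated[str, "Returns extracted text content from the page"]:
        return f"(scraped Confluence content for {team_name} goes here)"  # stub for this sketch

# Placeholder Azure OpenAI settings -- substitute your own endpoint, key, and API version
client = AsyncAzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-api-key>",
    api_version="2024-06-01",
)
chat_service = OpenAIChatCompletion(ai_model_id="gpt-4o", async_client=client)

agent = ChatCompletionAgent(
    service=chat_service,
    plugins=[ConfluencePlugin()],  # the scraping/summarizing functions become tools the agent can call
    name="ConfluenceAgent",
    instructions="You are a Program Management assistant. Use the scraped Confluence content to answer.",
)
```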
Step 4. Decide whether to process the user input with or without the tool.
I added an additional LLM call that checks whether the user input is relevant to Program Management before the agent is invoked.
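A simplified sketch of that guardrail, with the Azure OpenAI client passed in (the prompt in the full code is longer, but the 'True|'/'False|' prefix convention is the same):

```python
from openai import AsyncAzureOpenAI

async def is_program_management_question(client: AsyncAzureOpenAI, user_input: str) -> tuple[bool, str]:
    """Ask the model to prefix its reply with 'True|' or 'False|' based on relevance, then strip the prefix."""
    completion = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "You are a judge of the user's input. If it asks to scrape an internal "
                "Confluence page for a team, it is related to Program Management: prefix "
                "your reply with 'True|'. Otherwise answer normally but prefix with 'False|'."
            )},
            {"role": "user", "content": user_input},
        ],
        temperature=0.5,
    )
    text = completion.choices[0].message.content.strip()
    return text.startswith("True|"), text.split("|", 1)[-1]
```

In the full code, the same check runs inline inside stream_response, and the prefix decides whether the Semantic Kernel agent is invoked at all.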
Step 5. The final step is to produce the result and wire everything into the Streamlit UI.
Here is the entire code, with my project-specific details removed. Note that a state.json file (the saved browser session used for Confluence authentication) must be created before running the app, since the scraper loads it.
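If you don’t have a state.json yet, one way to create it (assuming you can complete the Confluence login interactively once; the login URL below is a placeholder) is to log in through a headed Playwright browser and export its storage state:

```python
from playwright.sync_api import sync_playwright

# One-off helper: log in manually, then persist cookies + local storage for later headless runs
with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    page.goto("https://confluence.example.com/login")  # placeholder Confluence login URL
    input("Log in to Confluence in the browser window, then press Enter here...")
    context.storage_state(path="state.json")  # this file is what the scraper loads later
    browser.close()
```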
import asyncio
import json
import os
import re
from typing import Annotated

import matplotlib.pyplot as plt
import pandas as pd
import streamlit as st
from bs4 import BeautifulSoup
from dotenv import load_dotenv
from openai import AsyncAzureOpenAI
from playwright.async_api import async_playwright

from semantic_kernel.agents import ChatCompletionAgent
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
from semantic_kernel.contents import FunctionCallContent, FunctionResultContent, StreamingTextContent
from semantic_kernel.functions import kernel_function
TEAM_URL_MAPPING = {
    "Team 1": "Confluence URL for Team 1",
    "Team 2": "Confluence URL for Team 2",
    "Team 3": "Confluence URL for Team 3",
    "Team 4": "Confluence URL for Team 4",
    "Team 5": "Confluence URL for Team 5",
    "Team 6": "Confluence URL for Team 6"
}
# ---- Plugin definition ----
# Bar chart with fixed size
def plot_bar_chart(df):
    status_counts = df["status"].value_counts()
    fig, ax = plt.subplots(figsize=(1.5, 1))  # width, height in inches
    ax.bar(status_counts.index, status_counts.values, color="#4CAF50")
    ax.set_title("Features by Status")
    ax.set_ylabel("Count")
    # Change tick color
    ax.tick_params(axis='x', colors='blue', labelrotation=90)  # x-ticks in blue, rotated
    ax.tick_params(axis='y', colors='green')  # y-ticks in green
    st.pyplot(fig)
def extract_json_from_response(text):
    # Use regex to find the first JSON array in the text
    match = re.search(r"(\[\s*{.*}\s*\])", text, re.DOTALL)
    if match:
        return match.group(1)
    return None
class ConfluencePlugin:
@kernel_function(description="Scrape and return text content from a Confluence page.")
async def get_confluence_page_content(
    self, team_name: Annotated[str, "Name of the Agile team"]
) -> Annotated[str, "Returns extracted text content from the page"]:
print(f"Attempting to scrape Confluence page for team: '{team_name}'") # Added for debugging
target_url = TEAM_URL_MAPPING.get(team_name)
if not target_url:
print(f"Failed to find URL for team: '{team_name}' in TEAM_URL_MAPPING.") # Added for debugging
return f"❌ No Confluence URL mapped for team '{team_name}'"
async with async_playwright() as p:
browser = await p.chromium.launch()
context = await browser.new_context(storage_state="state.json")
page = await context.new_page()
pages_to_scrape = [target_url]
# Loop through each page URL and scrape the content
for page_url in pages_to_scrape:
await page.goto(page_url)
await asyncio.sleep(30) # Wait for the page to load
await page.wait_for_selector('div.refresh-module-id, table.some-jira-table')
html = await page.content()
soup = BeautifulSoup(html, "html.parser")
body_div = soup.find("div", class_="wiki-content") or soup.body
if not body_div:
return "❌ Could not find content on the Confluence page."
# Process the scraped content (example: extract headings)
headings = soup.find_all('h2')
text = body_div.get_text(separator="\n", strip=True)
return text[:4000]  # Truncate if needed to stay within token limits
await browser.close()
@kernel_function(description="Summarize and structure scraped Confluence content into JSON.")
async def summarize_confluence_data(
    self, raw_text: Annotated[str, "Raw text scraped from the Confluence page"],
    output_style: Annotated[str, "Output style, either 'bullet' or 'json'"] = "json"  # Default to 'json'
) -> Annotated[str, "Returns structured summary in JSON format"]:
prompt = f"""
You are a Program Management Data Extractor.
Your job is to analyze the following Confluence content and produce structured machine-readable output.
Confluence Content:
{raw_text}
Instructions:
- If output_style is 'bullet', return bullet points summary.
- If output_style is 'json', return only a valid JSON array, with unprintable characters and leading/trailing spaces removed.
- DO NOT write explanations.
- DO NOT suggest code snippets.
- DO NOT wrap JSON inside triple backticks ```json
- Output ONLY the pure JSON array or bullet points list.
Output_style: {output_style}
"""
# Call OpenAI again
completion = await client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "system", "content": "You are a helpful Program Management Data Extractor."},
{"role": "user", "content": prompt}
],
temperature=0.1
)
structured_json = completion.choices[0].message.content.strip()
return structured_json
# ---- Load API credentials ----
load_dotenv()
client = AsyncAzureOpenAI(
    azure_endpoint="<>",
    api_key=os.getenv("AZURE_API_KEY"),
    api_version="<>"
)
chat_completion_service = OpenAIChatCompletion(
    ai_model_id="<>",
    async_client=client
)
AGENT_INSTRUCTIONS = """You are a helpful Program Management AI Agent that can extract key information such as Team Members, Features, and Epics from a Confluence page.
Important: When users specify a Team page, only extract the Features and Epics of that team.
When the conversation begins, introduce yourself with this message: "Hello! I'm your PM assistant. I can help you get the status of Features and Epics. What team are you interested in planning for today?"
Steps you MUST follow:
1. Always first call `get_confluence_page_content` to scrape the Confluence page.
   - If the user's message starts with "Team: {team_name}.", use that {team_name} for the `team_name` argument. For example, if the input is "Team: Raptor. What are the latest features?", the `team_name` is "Raptor".
2. If the user asks for a summary, provide a bullet-point list.
3. If the user asks for a JSON array, chart, or plot, immediately call `summarize_confluence_data` using the scraped content.
4. Based on the output style requested by the user, return either a JSON array or bullet points.
5. If the user doesn't specify an output style, default to a bullet-point summary.
6. If the user asks for a JSON array, return only valid JSON and plot a chart/graph.
Instructions:
- If output_style is 'bullet', return a bullet-point summary.
- If output_style is 'json', return only a valid JSON array, with unprintable characters and leading/trailing spaces removed.
- DO NOT write explanations.
- DO NOT suggest code snippets.
- DO NOT wrap JSON inside triple backticks ```json
- Output ONLY the pure JSON array or bullet-point list.
Always prioritize user preferences. If they mention a specific team, focus your data on that team rather than suggesting alternatives.
"""

agent = ChatCompletionAgent(
    service=chat_completion_service,
    plugins=[ConfluencePlugin()],
    name="ConfluenceAgent",
    instructions=AGENT_INSTRUCTIONS
)
# ---- Main async logic ----
async def stream_response(user_input, thread=None):
    html_blocks = []
    full_response = []
    function_calls = []
    parsed_json_result = None
    completion = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a judge of the user input. Analyze the user's input. If it is asking to scrape an internal Confluence page for a team, then it is related to Program Management. If it is not related to Program Management, provide the reply but add 'False|' to the response. If it is related to Program Management, add 'True|' to the response."},
            {"role": "user", "content": user_input}
        ],
        temperature=0.5
    )
    response_text = completion.choices[0].message.content.strip()
    print("Response Text:", response_text)
    if response_text.startswith("False|"):
        response_text = response_text[6:]
        yield response_text, thread, []  # Return the response without any further processing
        return
    elif response_text.startswith("True|"):
        response_text = response_text[5:]
        function_calls = re.findall(r"(\w+\(.*?\))", response_text)
        # Remove the function calls from the response text
        for call in function_calls:
            response_text = response_text.replace(call, "")
    # If the response is related to Program Management, continue processing
async for response in agent.invoke_stream(messages=user_input, thread=thread):
print("Response:", response)
thread = response.thread
agent_name = response.name
for item in list(response.items):
if isinstance(item, FunctionCallContent):
pass # You can ignore this now
elif isinstance(item, FunctionResultContent):
if item.name == "summarize_confluence_data":
raw_content = item.result
extracted_json = extract_json_from_response(raw_content)
if extracted_json:
try:
parsed_json = json.loads(extracted_json)
yield parsed_json, thread, function_calls
except Exception as e:
st.error(f"Failed to parse extracted JSON: {e}")
else:
full_response.append(raw_content)
else:
full_response.append(item.result)
elif isinstance(item, StreamingTextContent) and item.text:
full_response.append(item.text)
#print("Full Response:", full_response)
# After loop ends, yield final result
if parsed_json_result:
yield parsed_json_result, thread, function_calls
else:
yield ''.join(full_response), thread, function_calls
# ---- Streamlit UI Setup ----
st.set_page_config(layout="wide")
left_col, right_col = st.columns([1, 1])
st.markdown(""" """, unsafe_allow_html=True)

# ---- Main Streamlit app ----
with left_col:
    st.title("💬 Program Management Enabler AI")
    st.write("Ask me about different Wiley Program committed items!")
    st.write("I can help you get the status of Features and Epics.")
if "history" not in st.session_state:
st.session_state.history = []
if "thread" not in st.session_state:
st.session_state.thread = None
if "charts" not in st.session_state:
st.session_state.charts = []  # Each entry: {"df": ..., "title": ..., "question": ...}
if "chart_dataframes" not in st.session_state:
st.session_state.chart_dataframes = []
if st.button("🧹 Clear Chat"):
st.session_state.history = []
st.session_state.thread = None
st.rerun()
# Input box at the top
user_input = st.chat_input("Ask me about your team's features...")
# Example:
team_selected = st.session_state.get("selected_team")
if st.session_state.get("selected_team") and user_input:
user_input = f"Team: {st.session_state.get('selected_team')}. {user_input}"
# Preserve chat history when program or team is selected
if user_input and not st.session_state.get("selected_team_changed", False):
st.session_state.selected_team_changed = False
if user_input:
df = pd.DataFrame()
full_response_holder = {"text": "","df": None}
with st.chat_message("assistant"):
response_container = st.empty()
assistant_text = ""
try:
chat_index = len(st.session_state.history)
response_gen = stream_response(user_input, st.session_state.thread)
print("Response generator started",response_gen)
async def process_stream():
async for update in response_gen:
nonlocal_thread = st.session_state.thread
if len(update) == 3:
content, nonlocal_thread, function_calls = update
full_response_holder["text"] = content
if isinstance(content, list):
    # The agent already returned parsed JSON, so build the DataFrame from it directly
    df = pd.DataFrame(content)
    df.columns = df.columns.str.lower()
    print("\n📊 Features Status Chart")
    st.subheader("📊 Features Status Chart")
    plot_bar_chart(df)
    st.subheader("📋 Detailed Features Table")
    st.dataframe(df)
    chart_df = df.copy()
    chart_df.columns = chart_df.columns.str.lower()
    full_response_holder["df"] = chart_df
elif (re.sub(r'[\x00-\x1F\x7F]', '', content.replace("```json", "").replace("```", "").replace(" ", ""))[0] == "[" and
      re.sub(r'[\x00-\x1F\x7F]', '', content.replace("```json", "").replace("```", "").replace(" ", ""))[-1] == "]"):
    data = json.loads(re.sub(r'[\x00-\x1F\x7F]', '', content.replace("```json", "").replace("```", "")))
    df = pd.DataFrame(data)
    df.columns = df.columns.str.lower()
    chart_df = pd.DataFrame(data)
    chart_df.columns = chart_df.columns.str.lower()
    full_response_holder["df"] = chart_df
else:
    if function_calls:
        st.markdown("\n".join(function_calls))
flagtext = 'text'
st.session_state.thread = nonlocal_thread
try:
with st.spinner("🤖 AI is thinking..."):
flagtext = None
# Run the async function to process the stream
asyncio.run(process_stream())
# Update history with the assistant's response
if full_response_holder["df"] is not None and flagtext is None:
    st.session_state.chart_dataframes.append({
        "question": user_input,
        "data": full_response_holder["df"],
        "type": "chart"
    })
elif full_response_holder["text"].strip():
    # Text-type response
    st.session_state.history.append({
        "user": user_input,
        "assistant": full_response_holder["text"],
        "type": "text"
    })
flagtext = None
except Exception as e:
error_msg = f"⚠️ Error: {e}"
response_container.markdown(error_msg)
if chat_index > 0 and "Error" in full_response_holder["text"]:
    # Remove the last message only if it was an error
    st.session_state.history.pop(chat_index)
# Handle any exceptions that occur during the async call
except Exception as e:
full_response_holder["text"] = f"⚠️ Error: {e}"
response_container.markdown(full_response_holder\["text"\])
chat_index = len(st.session_state.history)
# for item in st.session_state.history[:-1]:
for item in reversed(st.session_state.history):
    if item["type"] == "text":
        with st.chat_message("user"):
            st.markdown(item["user"])
        with st.chat_message("assistant"):
            st.markdown(item["assistant"])
with right_col:
    st.title("Select Wiley Program")
    team_list = {
        "Program 1": ["Team 1", "Team 2", "Team 3"],
        "Program 2": ["Team 4", "Team 5", "Team 6"]
    }
    selected_program = st.selectbox("Select the Program:", ["No selection"] + list(team_list.keys()), key="program_selectbox")
    selected_team = st.selectbox("Select the Agile Team:", ["No selection"] + team_list.get(selected_program, []), key="team_selectbox")
    st.session_state["selected_team"] = selected_team if selected_team != "No selection" else None
    if st.button("🧹 Clear All Charts"):
        st.session_state.chart_dataframes = []
    chart_idx = 1
    # if len(st.session_state.chart_dataframes) == 1:
    for idx, item in enumerate(st.session_state.chart_dataframes):
        st.markdown(f"**Chart {idx + 1}: {item['question']}**")
        st.subheader("📊 Features Status Chart")
        plot_bar_chart(item["data"])
        st.subheader("📋 Detailed Features Table")
        st.dataframe(item["data"])
        chart_idx += 1
The Streamlit-based Program Management AI chatbot helps teams track project features and epics from Confluence pages. The application uses a Semantic Kernel agent backed by OpenAI GPT-4o and scrapes team-specific Confluence page content with Playwright, using the saved browser state (state.json) for authentication. The tool lets the user select a Program and the relevant team, and responses are scoped to that selection. With Agentic AI, the LLM becomes a genuine personal assistant: we can tightly limit which data the LLM can access while still applying its full capabilities to that restricted data. It is a practical example of the agentic pattern and of how powerful it can be.