Skip to content

Commit 93ee651

Browse files
Merge pull request #533 from SuffolkLITLab/copilot/setup-copilot-instructions
Add comprehensive GitHub Copilot instructions for documentation build process
2 parents fd132f0 + 1469240 commit 93ee651

File tree

15 files changed

+1026
-44
lines changed

15 files changed

+1026
-44
lines changed

.github/copilot-instructions.md

Lines changed: 124 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,124 @@
1+
# Document Assembly Line Documentation
2+
3+
The Document Assembly Line Documentation is a Docusaurus-based documentation website that combines automatic Python API documentation generation with manually authored guides for the Document Assembly Line project by Suffolk LIT Lab.
4+
5+
Always reference these instructions first and fallback to search or bash commands only when you encounter unexpected information that does not match the info here.
6+
7+
## Working Effectively
8+
9+
Bootstrap, build, and test the repository:
10+
- Install Python dependencies: `pip install docspec_python==2.2.1 git+https://github.com/nonprofittechy/pydoc-markdown@escape-brackets` -- takes 20 seconds
11+
- Clone external documentation sources (required for full build):
12+
```bash
13+
cd .. # go up one directory from repo root
14+
git clone https://github.com/SuffolkLITLab/docassemble-AssemblyLine.git
15+
git clone https://github.com/SuffolkLITLab/FormFyxer.git
16+
git clone https://github.com/SuffolkLITLab/docassemble-ALToolbox.git
17+
git clone https://github.com/SuffolkLITLab/docassemble-EFSPIntegration.git
18+
cd docassemble-AssemblyLine-documentation # return to repo root
19+
```
20+
- Generate Python API documentation: `pydoc-markdown` -- takes 2 seconds
21+
- Fix FormFyxer case sensitivity: `rm -rf docs/components/formfyxer` (if the directory exists)
22+
- Install Node.js dependencies: `PUPPETEER_SKIP_DOWNLOAD=true npm install` -- takes 20 seconds
23+
- Build the documentation: `npm run build` -- **WILL FAIL with PDF generation error** - this is expected behavior in CI environments. The HTML build succeeds and creates usable documentation in the `build/` directory.
24+
25+
Run the documentation website:
26+
- ALWAYS run the bootstrapping steps first.
27+
- Development server: `npm run start` -- starts on http://localhost:3000, takes 30 seconds to be ready
28+
- Production server: `npm run serve` -- serves built files on http://localhost:3000
29+
30+
## Critical Build Information
31+
32+
- **NEVER CANCEL** the `npm run build` command - it takes 60 seconds and may appear to hang during webpack compilation
33+
- **ALWAYS** use `PUPPETEER_SKIP_DOWNLOAD=true npm install` due to network restrictions preventing Chrome download
34+
- **PDF generation ALWAYS FAILS** in CI environments due to Puppeteer Chrome download restrictions - this is expected behavior
35+
- FormFyxer case sensitivity: pydoc-markdown creates duplicate `FormFyxer` and `formfyxer` directories, causing build warnings. If the build fails with case sensitivity errors, run: `rm -rf docs/components/formfyxer`
36+
- The build generates warnings about broken anchors - these are expected and do not prevent successful builds
37+
- Complete build workflow: `pydoc-markdown && rm -rf docs/components/formfyxer && npm run build`
38+
39+
## Validation
40+
41+
- ALWAYS manually validate any new code by starting the development server and navigating to affected pages
42+
- Test both development mode (`npm run start`) and production build (`npm run build && npm run serve`)
43+
- The site should load at http://localhost:3000 with a blue-themed homepage featuring "Open-source tools for court forms, guided interviews, and e-filing"
44+
- Navigation should work between homepage, Get started, and Documentation sections
45+
- Always run `npm run clear` before building if you encounter unexpected webpack errors
46+
47+
## Common Tasks
48+
49+
The following are outputs from frequently run commands. Reference them instead of viewing, searching, or running bash commands to save time.
50+
51+
### Repository root structure
52+
```
53+
.
54+
├── .git/
55+
├── .github/
56+
│ └── workflows/
57+
│ ├── deploy.yml
58+
│ └── test-deploy.yml
59+
├── .gitignore
60+
├── README.md
61+
├── babel.config.js
62+
├── docs/
63+
├── docusaurus.config.js
64+
├── package-lock.json
65+
├── package.json
66+
├── pydoc-markdown.yml
67+
├── sidebars.js
68+
├── src/
69+
└── static/
70+
```
71+
72+
### Key configuration files
73+
- `docusaurus.config.js`: Main Docusaurus configuration including plugins, themes, and site metadata
74+
- `pydoc-markdown.yml`: Configuration for extracting Python API documentation from external repositories
75+
- `sidebars.js`: Navigation structure for documentation sections
76+
- `package.json`: Node.js dependencies and scripts
77+
78+
### Build process sequence
79+
1. Python dependencies installation (~20 seconds)
80+
2. External repositories cloning (~5 seconds)
81+
3. Python API documentation generation via pydoc-markdown (~2 seconds)
82+
4. Node.js dependencies installation (~20 seconds)
83+
5. Docusaurus build compilation (~60 seconds)
84+
85+
### Known working commands
86+
- `npm run clear` -- clears build cache, takes 1 second
87+
- `npm run start` -- development server, ready in 30 seconds
88+
- `npm run build` -- production build, takes 60 seconds
89+
- `npm run serve` -- serves built files
90+
- `npm run swizzle` -- customize Docusaurus components
91+
- `npm run deploy` -- deploys to GitHub Pages (requires GIT_USER and USE_SSH)
92+
93+
### Development workflow
94+
1. Make changes to markdown files in `docs/` directory
95+
2. Test with `npm run start` for live reload during development
96+
3. Build with `npm run build` to test production build
97+
4. Check for build warnings and broken links
98+
5. Test navigation and content rendering with `npm run serve`
99+
100+
### External dependencies
101+
The site pulls API documentation from these repositories:
102+
- `docassemble-AssemblyLine`: Main Assembly Line framework
103+
- `FormFyxer`: PDF and DOCX manipulation tools
104+
- `docassemble-ALToolbox`: Additional utility functions
105+
- `docassemble-EFSPIntegration`: E-filing integration components
106+
107+
### Troubleshooting
108+
- If npm install fails: use `PUPPETEER_SKIP_DOWNLOAD=true npm install`
109+
- If build fails with FormFyxer case errors: `rm -rf docs/components/formfyxer`
110+
- If webpack compilation appears stuck: wait at least 60 seconds before investigating
111+
- If development server fails to start: run `npm run clear` first
112+
- **PDF generation always fails in CI** due to Chrome/Puppeteer restrictions - this is normal and expected
113+
- For local PDF generation: ensure Chrome is installed and accessible to Puppeteer, or disable `autoBuildPdfs` in `docusaurus.config.js`
114+
115+
### CI/CD Pipeline
116+
GitHub Actions automatically:
117+
1. Installs Python 3.11 and required packages
118+
2. Clones external documentation repositories
119+
3. Runs pydoc-markdown for API documentation
120+
4. Installs Node.js 20 and npm dependencies
121+
5. Builds the site with `npm run build`
122+
6. Deploys to GitHub Pages on main branch pushes
123+
124+
Test deployment runs the same process for pull requests without the final deploy step.

docs/components/ALToolbox/llms.md

Lines changed: 66 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -7,21 +7,22 @@ title: ALToolbox.llms
77

88
A light wrapper on the OpenAI chat endpoint.
99

10-
Includes support for token limits, error handling, and moderation queue.
11-
12-
It is also possible to specify an alternative model, and we support GPT-4-turbo's JSON
13-
mode.
14-
15-
As of today (1/2/2024) JSON mode requires the model to be set to "gpt-4-1106-preview" or "gpt-3.5-turbo-1106"
10+
Includes support for token limits, minimal error handling, and moderation.
1611

1712
**Arguments**:
1813

1914
- `system_message` _str_ - The role the chat engine should play
2015
- `user_message` _str_ - The message (data) from the user
2116
- `openai_client` _Optional[OpenAI]_ - An OpenAI client object, optional. If omitted, will fall back to creating a new OpenAI client with the API key provided as an environment variable
2217
- `openai_api` _Optional[str]_ - the API key for an OpenAI client, optional. If provided, a new OpenAI client will be created.
23-
- `temperature` _float_ - The temperature to use for the GPT-4-turbo API
24-
- `json_mode` _bool_ - Whether to use JSON mode for the GPT-4-turbo API
18+
- `temperature` _float_ - The temperature to use for the GPT API
19+
- `json_mode` _bool_ - Whether to use JSON mode for the GPT API. Requires the word `json` in the system message, but will add if you omit it.
20+
- `model` _str_ - The model to use for the GPT API
21+
- `messages` _Optional[List[Dict[str, str]]]_ - A list of messages to send to the chat engine. If provided, system_message and user_message will be ignored.
22+
- `skip_moderation` _bool_ - Whether to skip the OpenAI moderation step, which may save seconds but risks banning your account. Only enable when you have full control over the inputs.
23+
- `openai_base_url` _Optional[str]_ - The base URL for the OpenAI API. Defaults to value provided in the configuration or "https://api.openai.com/v1/".
24+
- `max_output_tokens` _Optional[int]_ - The maximum number of tokens to return from the API. Defaults to 16380.
25+
- `max_input_tokens` _Optional[int]_ - The maximum number of tokens to send to the API. Defaults to 128000.
2526

2627

2728
**Returns**:
@@ -44,7 +45,7 @@ Extracts fields from text.
4445

4546
#### match\_goals\_from\_text
4647

47-
Read's a user's message and determines whether it meets a set of goals, with the help of an LLM.
48+
Reads a user's message and determines whether it meets a set of goals, with the help of an LLM.
4849

4950
**Arguments**:
5051

@@ -232,3 +233,59 @@ Returns the next unsatisfied goal, along with a follow-up question to ask the us
232233
233234
Returns a draft response that synthesizes the user's responses to the questions.
234235
236+
#### provide\_feedback
237+
238+
Returns feedback to the user based on the goals they satisfied.
239+
240+
## IntakeQuestion Objects
241+
242+
```python
243+
class IntakeQuestion(DAObject)
244+
```
245+
246+
A class to represent a question in an LLM-assisted intake questionnaire.
247+
248+
**Attributes**:
249+
250+
- `question` _str_ - The question to ask the user
251+
- `response` _str_ - The user's response to the question
252+
253+
## IntakeQuestionList Objects
254+
255+
```python
256+
class IntakeQuestionList(DAList)
257+
```
258+
259+
Class to help create an LLM-assisted intake questionnaire.
260+
261+
The LLM will be provided a free-form set of in/out criteria (like that
262+
provided to a phone intake worker), an initial draft question from the user,
263+
and then guide the user through a series of follow-up questions to gather only
264+
enough information to determine if the user meets the criteria.
265+
266+
In/out criteria are often pretty short, so we do not make or support
267+
embeddings at the moment.
268+
269+
**Attributes**:
270+
271+
- `criteria` _Dict[str, str]_ - A dictionary of criteria to match, indexed by problem type
272+
- `problem_type_descriptions` _Dict[str, str]_ - A dictionary of descriptions of the problem types
273+
- `problem_type` _str_ - The type of problem to match. E.g., a unit/department inside the law firm
274+
- `initial_problem_description` _str_ - The initial description of the problem from the user
275+
- `initial_question` _str_ - The original question posed in the interview
276+
- `question_limit` _int_ - The maximum number of follow-up questions to ask the user. Defaults to 10.
277+
- `model` _str_ - The model to use for the GPT API. Defaults to gpt-4.1.
278+
- `max_output_tokens` _int_ - The maximum number of tokens to return from the API. Defaults to 4096
279+
- `llm_role` _str_ - The role the LLM should play. Allows you to customize the script the LLM uses to guide the user.
280+
We have provided a default script that should work for most intake questionnaires.
281+
- `llm_user_qualifies_prompt` _str_ - The prompt to use to determine if the user qualifies. We have provided a default prompt.
282+
- `out_of_questions` _bool_ - Whether the user has run out of questions to answer
283+
- `qualifies` _bool_ - Whether the user qualifies based on the criteria
284+
285+
#### need\_more\_questions
286+
287+
Returns True if the user needs to answer more questions, False otherwise.
288+
289+
Also has the side effect of checking the user's most recent response to see if it satisfies the criteria
290+
and updating both the next question to be asked and the current qualification status.
291+

docs/components/ALToolbox/misc.md

Lines changed: 112 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,7 @@ you more control over the icon that is inserted.
2828

2929
- `icon` - a string representing a fontawesome icon. The icon needs to be in the
3030
[free library](https://fontawesome.com/search?o=r&m=free).
31-
- `color` - can be any [Bootstrap color variable](https://getbootstrap.com/docs/5.0/utilities/colors/).
31+
- `color` - can be any [Bootstrap color variable](https://getbootstrapc.mo/docs/4.0/utilities/colors).
3232
For example: `primary`, `secondary`, `warning`
3333
- `color_css` - allows you to use a CSS code to represent the color, e.g., `blue`, or ``fff`` for black
3434
- `size` - used to control the [fontawesome size](https://fontawesome.com/v6.0/docs/web/style/size)
@@ -148,3 +148,114 @@ of privileges.
148148

149149
HTML for a grid of buttons
150150

151+
#### none\_to\_empty
152+
153+
If the value is None or "None", return a DAEmpty value. Otherwise return the value.
154+
155+
This is useful for filling in a template and to prevent the word None from appearing in the output. For example,
156+
when handling a radio button that is not required and left unanswered.
157+
158+
A DAEmpty value appears as an empty string in the output. You can also safely transform it or use any method on it
159+
without raising an error.
160+
161+
**Arguments**:
162+
163+
- `val` - the value to check
164+
165+
**Returns**:
166+
167+
a DAEmpty if the value is None, otherwise the value
168+
169+
#### option\_or\_other
170+
171+
If the variable is set to 'Other', return the value of the 'other' variable. Otherwise return the value of the variable.
172+
173+
This is useful for filling in a template and to prevent the word 'Other' from appearing in the output.
174+
175+
**Arguments**:
176+
177+
- `variable_name` - the name of the variable to check
178+
- `other_variable_name` - the name of the variable to return if the value of the first variable is 'Other'
179+
180+
**Returns**:
181+
182+
the value of the variable if it is not 'Other', otherwise the value of the other variable
183+
184+
#### true\_values\_with\_other
185+
186+
Return a list of values that are True, with the value of the 'other' variable appended to the end of the list.
187+
188+
This is useful for filling in a template and to prevent the word 'Other' from appearing in the output.
189+
190+
**Arguments**:
191+
192+
- `variable` - the dictionary of variables to check
193+
- `other_variable_name` - the name of the variable (as a string) to return if the value of the first variable is 'Other'
194+
195+
**Returns**:
196+
197+
a list of values that are True, with the value of the 'other' variable appended to the end of the list.
198+
199+
#### include\_a\_year
200+
201+
Validates whether the input text contains at least one 4-digit sequence
202+
that occurs within a range of ~ 200 years, indicating a valid "year"
203+
for an event that should be reported on most court forms, like a birthdate
204+
or a moving date.
205+
206+
Returns True if found, otherwise raises a DAValidationError.
207+
208+
#### is\_leap\_year
209+
210+
Helper function for `age_in_years` to determine if a year is a leap year.
211+
212+
**Arguments**:
213+
214+
- `year` - The year to check.
215+
216+
**Returns**:
217+
218+
True if the year is a leap year, False otherwise.
219+
220+
#### age\_in\_years
221+
222+
Calculate the age in years from a date (treated like a date of birth).
223+
224+
**Arguments**:
225+
226+
- `the_date` - A string or DADateTime object representing the date of birth.
227+
228+
**Returns**:
229+
230+
The age in years as an integer.
231+
232+
#### format\_date\_if\_defined
233+
234+
Format a date string if it is defined, otherwise return an empty string.
235+
236+
Passes all additional arguments to the `format_date` function.
237+
238+
**Arguments**:
239+
240+
- `date_object_name` - The date string to format.
241+
- `*pargs` - Additional positional arguments to pass to `format_date`.
242+
- `default` - A default string to return if `date_object_name` is not defined.
243+
- `**kwargs` - Additional keyword arguments to pass to `format_date`. E.g., format="yyyy-MM-dd"
244+
245+
246+
**Returns**:
247+
248+
A formatted date string if `date_object_name` is defined, otherwise an empty string.
249+
250+
251+
**Example**:
252+
253+
254+
>>> format_date_if_defined("users[0].birthdate", format='yyyy-MM-dd')
255+
256+
Returns a formatted date string if "users[0].birthdate" is defined, otherwise returns an empty string.
257+
258+
>>> format_date_if_defined("users[0].birthdate", default="No date provided", format='yyyy-MM-dd ')
259+
260+
Returns a formatted date string followed by one space if "users[0].birthdate" is defined, otherwise returns "No date provided". (Note space is added to the format="..." parameter)
261+

0 commit comments

Comments
 (0)