Automate office1¶
About¶
Automation
to prepare visualization(s)
and slide deck at scale
.
┣━ code : User modified main program(s)
┃ ┣━ ao_batch.py : Create .pptm presentation in batch mode.
┃ ┣━ ao_explore.py : Create individual charts/graphs as in Visual Library.
┃
┣━ input : User defined directory
┃ ┣━ ____control.xlsm : Input to align slide(s) and their content.
┃ ┣━ data/ : Input .csv files.
┃ ┣━ img/ : Input image files.
┃ ┗━ templates/ : Reusable .potm template
┃
┣━ output: : Output file format
┃ ┗━ : <report-name>__<yyyymmdd>_<hhmm>.pptm
Automate office integrates excel vba
, R
and powerpoint presenation
elements using python
.
The structure is build around taking control of placeholder elements of powerpoint master slide.
The input to prepare powerpoint deck is structured through a control file.
Demo¶
examples\ao_batch.py
: Build slide deck.examples\ao_explore.py
: Build individual visualizations.
Target¶
Process transformation
by eliminating manual steps
to prepare visualization(s)
and slide deck(s)
.
- The automation targets to save
~30%
of the time taken to work on new presentation related tasks and~80%
of the time on repeat tasks. Standardization
to bring muchdesired consistency
toinsights generated by the team
.- Adopting
script based visualization
that are faster and equal to or better than existing options. Building product
thatsaves teams time
andbrings new innovation(s)
enablinggenerating better insights
faster.
New tasks
Repeat tasks
Design¶
Project is designed on following ground rules:
What you see is what you get
: Visualization or Slide option in Visual Library to guide in choosing available options.Limited to no learning curve
: Structure the project that anyone can use the automation irrespective of lack of knowledge of a language.For a task choose software that works best
: If excel works better vs any other software then we do that task in excel.Scale, Compatibility, flexibility
: Project is designed to be easliy scaled to include large number of visualization(s) providing intutive access to resources needed to build custom visualization and powerpoint presentations. We have tested the codebase across different system combinations (windows 10, 11; Microsoft Office 2016, 2019, 2021, 365, Python 310, 311, 312, 313, R 4.4.3) ensuring programs do not break on known environments. Anyone can contribute to the visualization library following simple rules as laid underdeveloper
section.End-to-end automated integration
: All components of the project are end-to-end automated, E.g. Visual library is autogenerated using python code so no manual maintenace or development. The automated components are integrated, E.g. The data processing takes color input from the PADAWA docs data table. Most of the components of the automation feed on already existing resources in other atuomation projects (padawa_docs
,searchx
).
Project uses following softwares:
.py
: Python to control the project..bas
: VBA to create visualizations in Excel..potm
: Powerpoint to create reusable slide templates..md
: Markdown integrates the project and make it easily accessible..R
: R to create visualizations not possible through Excel (vba) or Python respectively.
Usage¶
batch.py¶
In batch mode the sheet named calibration
in ____control.xlsx
file is used to control the number of slides, type of charts, formatting, text and image input to a presentation.
Just like explore.py
the analyst can see if certain visualizations or slides are present in Visual library
, pick and choose the charts and slides that are desired and reference the same in batch.py
file.
Developer
A developer needs to understand complex relationship between different elements of the project:
-
Visual library
- The code and output should be the same named file e.g.
bar.bas
and its corresponding `bar.jpg' file. - Only
.bas
files to have.jpg
output, other code can have other output types like `.png'. - Keep the variable names that control the charts consistent across codes irrespective of language.
- Keep things organized in folders with parent being
visaul_library
and child folders likecompare
,distribution
,flow
and so on. - Ensure no repeats in the slide generated by
calibration
sheet.
- The code and output should be the same named file e.g.
-
____control.xlsm
- Maintain the structure of the sheets in the control file.
- Understand that the
master
slide and output in folderoutput\learn
are related to how this file is derived.
-
batch.py
- Understand the dependencies and their relationship with all other elements that the program works with.
explore.py¶
Browse
: Check if a visualization that is expected exists in the Visual library.Copy or load data
: Copy data into clipboard from the data source(s).Transpose data
: Runexplore.py
Step2
to transpose data intoxyzv
convention.Paste transposed data to .xlsm file
: Runexplore.py 2.3
and paste transposed data tocell A1
in file generated when initializingexplore.py
.
Pipeline¶
CI/CD (Continuous Integration/Continuous Delivery) Pipeline
approach allows us to do as much as possible, an end to end integration
, eliminating manual data preparation steps in the process.
Future development¶
- Expand
Visual library
to include most of the frequently used visualizations. - Close project libraries for targeted use (right now code is open).
- Development of
adhoc.py
(currently code heavy) just in line withexplore.py
andbatch.py
. - Comments on slides to be powered by either of the feasible solution(s):
- Local
GPT
(Generative Pre-trained Transformer) engine. ML
(machine learning) based model trained on.ppt* title corpus
leveragingsearchx
.MCP
(Model Context Protocol) process deployment.
- Local
-
Please move around the template master elements as the output .pptx file is less organzied. ↩