Core

The core module contains the Agent, STT, TTS, as well as dataclasses related to the Agent. The Agent is responsible for the overall control flow of the application. It is the main entry point for the user and the main interface to the other modules. It also takes care of the proactivity of the application.

Agent

graph LR;

start_agent --> greet_user;
greet_user --> check_for_proactivity;
check_for_proactivity --> trigger_proactivity;
trigger_proactivity --> get_user_input;
check_for_proactivity --> get_user_input;
get_user_input --> calculate_best_match;
calculate_best_match --> trigger_use_case;
trigger_use_case --> check_for_proactivity;

`Agent(get_mic=False)`

Class to handle the main functionality of the assistant.

The core functionality of the assistant is to handle speech-to-text (stt) conversion and text-to-speech (tts) conversion, calculate the best match for the parsed text, greet the user, trigger the right use case, and handle proactivity.

Parameters:

Name	Type	Description	Default
`get_mic`	`bool, optional`	Whether the speech-to-text class should first ask which microphone to use. By default `False`.	`False`

Attributes:

Name	Type	Description
`assistant_name`	`str`	The name of the assistant
`quotes`	`pd.DataFrame`	DataFrame storing the use cases and functionality combinations
`user`	`User`	User class to store the user information (e.g., name, age)
`stt`	`SpeechToText`	Speech to text class to handle speech-to-text conversion
`tts`	`TextToSpeech`	Text to speech class to handle text-to-speech conversion
`log_proactivity`	`LogProactivity`	Log proactivity class to handle the logging of proactivity
`uc_general`	`GeneralUseCase`	General use case class to handle general use cases
`uc_navigation`	`NavigationUseCase`	Navigation use case class to handle navigation use cases
`uc_event`	`EventUseCase`	Event use case class to handle event use cases
`uc_sport`	`SportUseCase`	Sport use case class to handle sport use cases

`_check_proactivity(test_proactivity=None)`

Checks if there are any updates which should be announced to the user

Checks every 60 seconds if there are any updates which should be announced to the user. There is an additional option to set a separate interval for each use case.

Proactivity IDs

The following table shows the IDs for the proactivity.

ID	Use Case
1	Event
2	Morning Briefing
3	Sport
4	Navigation

Parameters:

Name	Type	Description	Default
`test_proactivity`	`int | None, optional`	An integer between `1` and `5` which triggers the proactivity for the corresponding use case.	`None`
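The interval logic described above could be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the function name `due_use_cases`, the `INTERVALS` mapping, and all per-use-case interval values are assumptions. Only the 60-second base interval and the ID table come from this page.

```python
from datetime import datetime, timedelta

# Proactivity IDs as listed in the table above.
PROACTIVITY_IDS = {1: "event", 2: "morning_briefing", 3: "sport", 4: "navigation"}

# Hypothetical per-use-case intervals; the real values are not documented here.
INTERVALS = {
    "event": timedelta(minutes=15),
    "morning_briefing": timedelta(minutes=1),
    "sport": timedelta(minutes=30),
    "navigation": timedelta(minutes=5),
}

# Base interval between two proactivity checks (60 seconds, as stated above).
BASE_INTERVAL = timedelta(seconds=60)


def due_use_cases(last_check, last_checks, now, test_proactivity=None):
    """Return the names of the use cases whose proactivity check is due at `now`."""
    # A test ID forces exactly one use case, regardless of timing.
    if test_proactivity is not None:
        return [PROACTIVITY_IDS[test_proactivity]]
    # Nothing is checked more often than the base interval.
    if now - last_check < BASE_INTERVAL:
        return []
    # Otherwise every use case decides based on its own interval.
    return [
        name
        for name, interval in INTERVALS.items()
        if now - last_checks[name] >= interval
    ]
```

Passing `now` explicitly instead of calling `datetime.now()` inside the function keeps the check deterministic and easy to test.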

`_evaluate_use_case(parsed_text)`

Evaluates the parsed text to trigger the correct use case

Parameters:

Name	Type	Description	Default
`parsed_text`	`str`	The voice input of the user, parsed to a lowercase string	required
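How the use cases are wired together is not shown on this page. The description of `self.quotes` mentions a chain-of-responsibility pattern keyed on `use_case` and `choice`, so one possible shape is sketched below. The class `UseCaseHandler`, its `handle` method, and the lambda functions are hypothetical names for illustration and are not part of the documented API.

```python
class UseCaseHandler:
    """One link in the chain; responsible for a single use case name."""

    def __init__(self, use_case, functions, successor=None):
        self.use_case = use_case
        self.functions = functions  # maps a choice to a callable
        self.successor = successor  # next handler in the chain, if any

    def handle(self, use_case, choice):
        # Handle the request if this link is responsible for it.
        if use_case == self.use_case and choice in self.functions:
            return self.functions[choice]()
        # Otherwise pass the request on to the next link.
        if self.successor is not None:
            return self.successor.handle(use_case, choice)
        # End of the chain: no handler felt responsible.
        return None


# Placeholder functions standing in for the real use case endpoints.
navigation = UseCaseHandler("navigation", {"dhbw": lambda: "route to dhbw"})
chain = UseCaseHandler(
    "events", {"eventSummary": lambda: "event summary"}, successor=navigation
)
```

Calling `chain.handle("navigation", "dhbw")` walks past the events handler and is answered by the navigation handler.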

`_get_best_match(parsed_text, threshold=0.7)`

Find the best match for the parsed text

Function calculates the similarity between the parsed text and the use cases.

TODO: Add tokenization and stop words
TODO: Watch if the default threshold is too high

self.quotes DataFrame

The self.quotes DataFrame consists of three columns: use_case, choice, and phrase. We use the use_case and choice columns for the chain-of-responsibility pattern to map the best match to the final function. The phrase column contains the phrases that are compared to the parsed text.

	use_case	choice	phrase
0	morningBriefing	newsSummary	whats going on
1	morningBriefing	newsSummary	morning briefing
2	events	eventSummary	what is going on
3	navigation	dhbw	dhbw
4	navigation	dhbw	i need to get to the dhbw
5	navigation	hpe	i need to get to the hpe

Parameters:

Name	Type	Description	Default
`parsed_text`	`str`	The parsed text which should be matched to a use case.	required
`threshold`	`float, optional`	The threshold which is used to determine if the similarity is high enough to be considered. The value needs to be between 0 and 1. By default `0.7`.	`0.7`

Raises:

Type	Description
`ValueError`	If the threshold is not between 0 and 1.

Returns:

Type	Description
`BestMatch`	Returns an object with the use case, the selected endpoint within the use case (choice), the similarity, and the parsed text.
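The matching step described above can be illustrated with a small sketch. Several things here are assumptions: the similarity measure is not named on this page, so `difflib.SequenceMatcher` from the standard library is swapped in; a plain list of tuples stands in for the `pd.DataFrame`; and returning `None` when no phrase reaches the threshold is a guess, since the documented behaviour for that case is not stated.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class BestMatch:
    use_case: str
    function_key: str
    similarity: float
    parsed_text: str


# Rows mirror the example table above: (use_case, choice, phrase).
QUOTES = [
    ("morningBriefing", "newsSummary", "whats going on"),
    ("morningBriefing", "newsSummary", "morning briefing"),
    ("events", "eventSummary", "what is going on"),
    ("navigation", "dhbw", "dhbw"),
    ("navigation", "dhbw", "i need to get to the dhbw"),
    ("navigation", "hpe", "i need to get to the hpe"),
]


def get_best_match(parsed_text, quotes=QUOTES, threshold=0.7):
    """Return the most similar phrase as a BestMatch, or None below the threshold."""
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    best = None
    for use_case, choice, phrase in quotes:
        # Ratio in [0, 1]; 1.0 means the strings are identical.
        similarity = SequenceMatcher(None, parsed_text, phrase).ratio()
        if best is None or similarity > best.similarity:
            best = BestMatch(use_case, choice, similarity, parsed_text)
    # Assumption: a match below the threshold counts as "no match".
    if best is None or best.similarity < threshold:
        return None
    return best
```

With this sketch, an input that exactly equals one of the phrases yields a similarity of `1.0`, and an input that shares no characters with any phrase yields no match.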

`_greeting()`

Function to greet the user.

Depending on the time of the day, the assistant greets the user with a different greeting.
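A time-dependent greeting might look like the sketch below. The hour boundaries, the wording of the greetings, and the function signature are illustrative assumptions; the actual `_greeting()` takes no arguments and its thresholds are not documented here.

```python
from datetime import datetime


def greeting(now, name):
    """Pick a greeting based on the hour of `now`."""
    # The hour boundaries are assumptions chosen for illustration.
    if 5 <= now.hour < 12:
        salutation = "Good morning"
    elif 12 <= now.hour < 18:
        salutation = "Good afternoon"
    else:
        salutation = "Good evening"
    return f"{salutation}, {name}."
```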

`main(test_proactivity=None)`

Main function to interact with the user

This is the main function of the assistant. It first greets the user and then checks proactively whether there are updates for the user. If that's not the case, it starts listening for user input in 60-second intervals. If the user input is not empty, it executes the matching use case function and then checks for proactivity again.

The threading library seems to be incompatible with some Python versions (documentation). Therefore, it will be removed and the agent will run in a single thread.

TODO: Add hotword detection

Parameters:

Name	Type	Description	Default
`test_proactivity`	`int | None, optional`	An integer between `1` and `5` which triggers the proactivity for the corresponding use case.	`None`
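The single-threaded control flow from the flow chart at the top of this page can be sketched as a plain loop. The function `run_agent`, its callable parameters, and `max_turns` are hypothetical; they only exist to make the sketch self-contained and to let it terminate, whereas the real `main` presumably runs until the program is stopped.

```python
def run_agent(greet, check_proactivity, get_user_input, get_best_match,
              trigger_use_case, max_turns=None):
    """Single-threaded loop: greet once, then alternate proactivity and input."""
    greet()
    turns = 0
    while max_turns is None or turns < max_turns:
        # Announce pending updates before listening to the user.
        check_proactivity()
        # May return None, for example when listening times out.
        parsed_text = get_user_input()
        if parsed_text:
            match = get_best_match(parsed_text)
            if match is not None:
                trigger_use_case(match)
        turns += 1
```

Injecting the steps as callables keeps the loop independent of the microphone and the speech engine, so it can be exercised with simple stand-ins.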

User Interaction

`SpeechToText(get_mic)`

Class to convert speech to text.

Initializes the speech to text class.

TODO: Think about a better way to handle the case that the microphone_index is not required

Parameters:

Name	Type	Description	Default
`get_mic`	`bool`	If the speech to text class should first get the microphone index.	required

Attributes:

Name	Type	Description
`recognizer`	`sr.Recognizer`	The speech recognition object.
`microphone_index`	`int | None`	The index of the microphone which should be used.

`check_if_yes()`

First gets the user input and then checks if the user said yes.

TODO: Fix that yes is not recognized well

Returns:

Type	Description
`bool`	`True` if the user said yes, otherwise `False`.

`convert_audio_file(audio_file)`

Converts an audio file to text.

TODO: For now this function is only for testing purposes

Parameters:

Name	Type	Description	Default
`audio_file`	`str | Path`	The path to the audio file which should be converted.	required

Returns:

Type	Description
`str | None`	The parsed text or `None` if the parsing failed.

`convert_speech(line_above=False)`

First records an audio file and then parses it to text.

If the function does not detect any speech for 60 seconds, it times out and returns `None`.

TODO: Maybe use adjust_for_ambient_noise
TODO: Add function to cancel the request without quitting the program

Parameters:

Name	Type	Description	Default
`line_above`	`bool, optional`	If a new line should be printed before the user input. By default `False`.	`False`

Returns:

Type	Description
`str | None`	The parsed text or `None` if no text could be parsed.

`TextToSpeech()`

Class to convert text to speech.

Initializes the text to speech class.

TODO: Add Attributes section

Attributes:

Name	Type	Description
`engine`	`pyttsx3.Engine`	The text to speech engine.

`convert_text(text, optimize_time=True, optimize_numbers=True, line_above=False)`

Converts text to speech

Parameters:

Name	Type	Description	Default
`text`	`str`	The text which should be converted to speech.	required
`optimize_time`	`bool, optional`	If the time should be optimized for speech. Will replace `22:00` with `22 o'clock` and `22:30` with `22 30`. By default `True`.	`True`
`optimize_numbers`	`bool, optional`	If the numbers should be optimized for speech. Will replace `22.` with `22 .`. By default `True`.	`True`
`line_above`	`bool, optional`	If a new line should be printed before the bot input. By default `False`.	`False`

`optimize_text(text, optimize_time, optimize_numbers)`

Optimizes text with time indications (in HH:MM format) for speech.

Parameters:

Name	Type	Description	Default
`text`	`str`	The text which should be optimized.	required
`optimize_time`	`bool`	If the time should be optimized for speech. Will replace `22:00` with `22 o'clock` and `22:30` with `22 30`.	required
`optimize_numbers`	`bool`	If the numbers should be optimized for speech. Will replace `22.` with `22 .`.	required

Returns:

Type	Description
`str`	The optimized text.
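The replacements listed in the parameter table can be reproduced with a few regular expressions. The sketch below is one possible way to get the documented results (`22:00` to `22 o'clock`, `22:30` to `22 30`, `22.` to `22 .`); the exact patterns are assumptions, including the choice to leave decimals such as `3.5` untouched.

```python
import re


def optimize_text(text, optimize_time, optimize_numbers):
    """Rewrite times and numbers followed by a period for a TTS engine."""
    if optimize_time:
        # Full hours: "22:00" -> "22 o'clock".
        text = re.sub(r"\b(\d{1,2}):00\b", r"\1 o'clock", text)
        # Remaining times: "22:30" -> "22 30".
        text = re.sub(r"\b(\d{1,2}):(\d{2})\b", r"\1 \2", text)
    if optimize_numbers:
        # Separate a number from a following period: "22." -> "22 .".
        # The lookahead keeps decimals like "3.5" intact (an assumption).
        text = re.sub(r"(\d)\.(?!\d)", r"\1 .", text)
    return text
```

The full-hour rule runs first so that `22:00` is not turned into `22 00` by the more general pattern.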

Dataclasses

`Address` `dataclass`

Dataclass to store the address of a user.

Attributes:

Name	Type	Description
`street`	`str`	The street of the user.
`city`	`str`	The city of the user.
`zip_code`	`int`	The zip code of the user.
`country`	`str`	The country of the user.
`vvs_id`	`str`	The VVS ID of the user.

`BestMatch` `dataclass`

Dataclass to store the best match for a given user input.

Attributes:

Name	Type	Description
`use_case`	`str`	The name of the use case.
`function_key`	`str`	The key of the function which should be called.
`similarity`	`float`	The similarity between the user input and the best match.
`parsed_text`	`str`	The parsed text from the user input.

`Favorites` `dataclass`

Dataclass to store the favorites of a user.

For example the favorite stocks, sports teams, etc.

Parameters:

Name	Type	Description	Default
`stocks`	`list[str]`	The favorite stocks of the user.	required
`league`	`str`	The favorite league of the user.	required
`team`	`str`	The favorite team of the user.	required
`news_country`	`str`	The country the user wants to receive news from.	required
`news_keywords`	`list[str]`	The favorite news keywords of the user.	required
`wakeup_time`	`datetime`	The wakeup time of the user.	required

`LogProactivity` `dataclass`

A class to keep track of the last time proactivity was triggered

Attributes:

Name	Type	Description
`last_check`	`datetime`	The last time proactivity was triggered
`last_event_check`	`datetime`	The last time the event use case was triggered
`last_morning_briefing_check`	`datetime`	The last time the morning briefing was triggered
`last_wakeup_check`	`datetime`	The last time the wakeup in morning briefing was triggered
`last_sport_check`	`datetime`	The last time the sport use case was triggered
`last_navigation_check`	`datetime`	The last time the navigation use case was triggered

`Possessions` `dataclass`

Dataclass to store the possessions of a user.

Attributes:

Name	Type	Description
`bike`	`bool`	Does the user own a bike?
`car`	`bool`	Does the user own a car?

`User` `dataclass`

Dataclass to store the user data.

Attributes:

Name	Type	Description
`name`	`str`	The name of the user.
`age`	`int`	The age of the user.
`address`	`Address`	The address of the user.
`possessions`	`Possessions`	The possessions of the user.
`favorites`	`Favorites`	The favorites of the user.
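To show how the user-related dataclasses nest, here is a minimal sketch that declares them with the field names and types from the tables above and builds one instance. The declarations are a reconstruction from this page, not a copy of the module source, and every value in the example user is a made-up placeholder.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Address:
    street: str
    city: str
    zip_code: int
    country: str
    vvs_id: str


@dataclass
class Possessions:
    bike: bool
    car: bool


@dataclass
class Favorites:
    stocks: list[str]
    league: str
    team: str
    news_country: str
    news_keywords: list[str]
    wakeup_time: datetime


@dataclass
class User:
    name: str
    age: int
    address: Address
    possessions: Possessions
    favorites: Favorites


# All values below are placeholders for illustration only.
user = User(
    name="Alex",
    age=30,
    address=Address("Example Street 1", "Example City", 12345, "Germany", "example-stop-id"),
    possessions=Possessions(bike=True, car=False),
    favorites=Favorites(
        stocks=["EXAMPLE"],
        league="Example League",
        team="Example Team",
        news_country="de",
        news_keywords=["technology"],
        wakeup_time=datetime(2023, 1, 1, 7, 0),
    ),
)
```

Nested fields are then reachable by attribute access, for example `user.address.city` or `user.favorites.wakeup_time`.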
