A SING4DATASCIENCE/HSSingLog session contains a number of processes:
Generation of the playlist
Sing session
Result tabulation
Metadata tabulation
Result posting
This article covers these processes and the tools used for each.
Generation of the playlist
Playlist generation for the sing session is automated and has the following features:
Automatic lottery selection system
Fixed song system (currently applies to 3 songs)
The automatic lottery selection system uses the number of past entries and the elapsed time since the last session as weighting elements to simulate a lottery for the songs to be included in a playlist. With this weighting, the system tends to select songs with fewer past sessions and songs with a longer time elapsed since their last session.
This is implemented using PowerShell 7, with the lottery logic as a binary and the lottery process as a script.
Typical execution of the automatic lottery selection system
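The lottery logic itself lives in the binary, but the weighting idea can be sketched in plain PowerShell. The following is an illustrative sketch only, not the actual implementation; property names such as PastEntries and LastSession are assumptions.

# Sketch of a weighted lottery: fewer past entries and a longer gap since
# the last session both increase a song's chance of being picked.
function Select-WeightedSongs {
    param(
        [Parameter(Mandatory)] [object[]] $Songs,
        [int] $Count = 10
    )
    $now = Get-Date
    $scored = foreach ($song in $Songs) {
        $daysSinceLast = ($now - $song.LastSession).TotalDays
        $weight = (1 / ($song.PastEntries + 1)) * $daysSinceLast
        [pscustomobject]@{
            Song  = $song
            # Random draw scaled by the weight; heavier songs tend to rank higher
            Score = (Get-Random -Minimum 0.0 -Maximum 1.0) * $weight
        }
    }
    ($scored | Sort-Object -Property Score -Descending |
        Select-Object -First $Count).Song
}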
Sing session
After generation of the playlist is complete, the system generates a singing directive sheet. This sheet contains the order of songs to be sung and also shows the currently available metadata for these songs.
Singing directive sheet
Result tabulation
Once the singing session is complete, the results are tabulated. From the singing directive sheet, the Title ID column is extracted and tabulated using a tabulation sheet.
Tabulation sheet
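The actual counting happens in the tabulation sheet, but the same extraction and count can be sketched in PowerShell; the file and column names below are assumptions.

# Sketch only: count how many times each Title ID appears in the directive sheet
$directive = Import-Csv .\singing-directive.csv
$directive |
    Group-Object -Property 'Title ID' |
    Select-Object @{Name = 'Title ID'; Expression = { $_.Name } },
                  @{Name = 'Count'; Expression = { $_.Count } } |
    Export-Csv .\tabulation.csv -NoTypeInformation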
Once this tabulation process is complete, the Get-HSLCompiledFile cmdlet verifies the results (to make sure the scoring doesn’t have any irregularities) and combines them with the title information, which generates the results posted on GitHub.
Typical execution of the Get-HSLCompiledFile cmdlet
Metadata tabulation
When a song is missing metadata (either because it is a new song, or because its metadata has not yet been filled in), the metadata is entered into the singing directive sheet, and then the Update-HSLCsvData cmdlet is used to apply the update. This cmdlet updates 3 metadata CSV files: authors, release date, and tempo.
Typical execution of Update-HSLCsvData cmdlet
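The cmdlet’s internals aren’t shown here, but the general shape of such an update is simple. A sketch, assuming a Title ID key and hypothetical file and column names:

# Sketch only: append rows for titles that are missing from a metadata CSV
$existing = @(Import-Csv .\authors.csv)
$incoming = Import-Csv .\directive-metadata.csv
$missing  = @($incoming |
    Where-Object { $_.'Title ID' -notin $existing.'Title ID' } |
    Select-Object 'Title ID', 'Author')
$existing + $missing | Export-Csv .\authors.csv -NoTypeInformation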
Result posting
For the posting data, a Python script is used. Chiefly, this script contains the following 2 functions. (It uses pandas and Matplotlib.)
Generation of the text for the posts
Generation of the graph element for the posts
Typical execution of the result posting generation
After installing Anaconda, you can invoke conda init to initialize settings for PowerShell, including PowerShell Core. The problem with this approach is that Anaconda is initialized every time PowerShell is launched. This is fine if you use Python every time you launch PowerShell; however, in my case, I often use PowerShell without Python, which makes this inefficient.
I’ve decided to create a cmdlet called Initialize-Conda. Here’s the content. You can replace C:/Path/to/Conda with your Anaconda location to adapt this to your use. (Available for Windows and Linux in a Gist.)
function Initialize-Conda
{
    # Root of the Anaconda installation
    $CONDA_ROOT_DIR = "C:/Path/to/Conda" # Change this
    # Set the variables conda expects at process scope, so they survive
    # outside this script's scope (see the note below)
    [System.Environment]::SetEnvironmentVariable("CONDA_EXE", "$CONDA_ROOT_DIR/Scripts/conda.exe", [System.EnvironmentVariableTarget]::Process)
    [System.Environment]::SetEnvironmentVariable("_CE_M", "", [System.EnvironmentVariableTarget]::Process)
    [System.Environment]::SetEnvironmentVariable("_CE_CONDA", "", [System.EnvironmentVariableTarget]::Process)
    [System.Environment]::SetEnvironmentVariable("_CONDA_ROOT", "$CONDA_ROOT_DIR", [System.EnvironmentVariableTarget]::Process)
    [System.Environment]::SetEnvironmentVariable("_CONDA_EXE", "$CONDA_ROOT_DIR/Scripts/conda.exe", [System.EnvironmentVariableTarget]::Process)
    # Load Anaconda's own PowerShell module and activate the base environment
    Import-Module -Scope Global "$Env:_CONDA_ROOT/shell/condabin/Conda.psm1"
    conda activate base
}
The environment variable initialization appears a bit long; this is because the variables' scope does not get passed outside of the script scope. Therefore, I had to use .NET’s SetEnvironmentVariable to initialize the environment variables at the scope of the process. (Under Linux, CONDA_EXE and _CONDA_EXE should be changed to $CONDA_ROOT_DIR/bin/conda.)
It is also possible to use Add-CondaEnvironmentToPrompt to prefix the prompt with the current Conda environment. I’ve omitted this, as I found it to be unreliable.
Here’s the corresponding PSD1 file. (Available from Gist.)
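In essence, the manifest only needs to point at the PSM1 and export the function. A minimal sketch (the actual file in the Gist may carry more metadata; the version and GUID below are placeholders):

@{
    RootModule        = 'Conda.psm1'
    ModuleVersion     = '1.0.0'                                  # placeholder
    GUID              = '00000000-0000-0000-0000-000000000000'   # placeholder
    FunctionsToExport = @('Initialize-Conda')
}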
Take these two files, Conda.psm1 and Conda.psd1, and place them under Documents\PowerShell\Modules\Conda (under Linux, it’s ~/.local/share/powershell/Modules/Conda), and then you should be able to launch Initialize-Conda.
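Assuming the manifest exports the function and the folder is on PSModulePath, the module should auto-load on first use:

Initialize-Conda      # imports Anaconda's Conda.psm1 and activates base
conda env list        # conda commands are now available in this session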
I have been staffing Sakura-Con Guest Relations since 2002, and throughout the years I have been experimenting with numerous technologies to support event planning. Previously I have used Org-mode and some small Python scripts, and this year I have been building the system using PowerShell Core. Since I have been using PowerShell for various applications, like tabulating karaoke data, I wanted to expand its use to this area as well.
Motivations
My goal for this year was to build a centralized solution to:
Provide an automated way of tabulating event schedule data and create a data set that I can work with.
Provide conversion from the event schedule data to other useful data types, such as calendar formats.
The reasons I picked PowerShell Core are:
I wanted a solution that works across multiple platforms. I use Linux and Windows, and it is a big plus if it runs on Android. What makes this great is that PowerShell Core can run in an Android environment through an app like UserLAnd. (If you are doing this on Android, a keyboard such as this one helps a lot.)
I wanted something that I can use offline. During the event, the internet connection may be degraded to the point that it’s unusable.
Components and Setup
The first module I created is a data fetcher that retrieves schedule data from the scheduling system. The scheduling system unfortunately lacked an API that would allow me to retrieve the data, so I used Selenium to retrieve it.
Retrieving data using Selenium (the scheduling solution's name is obscured for confidentiality reasons)
Obtaining data through Selenium takes about 30 seconds; there are about 800 entries (though I regularly interact with only about 1/8 of them). The Selenium module navigates to the appropriate calendar ranges through the UI and grabs the results out of the DOM. This is coded in C#. Since Selenium provides drivers for both Linux and Windows, the same module works cross-platform as well. (Probably not on Android, but if there were a driver for Android, this could potentially be done there too.)
The module exposes the following data, exported as an array:
TypeName: SakuraCon.Relations.Data.Event
Name MemberType Definition
---- ---------- ----------
End Time AliasProperty End Time = EndTime
Event Title AliasProperty Event Title = EventTitle
Start Time AliasProperty Start Time = StartTime
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
EndTime Property datetime EndTime {get;set;}
EventId Property string EventId {get;set;}
EventTitle Property string EventTitle {get;set;}
Notes Property string Notes {get;set;}
Rating Property string Rating {get;set;}
StartTime Property datetime StartTime {get;set;}
Type Property string Type {get;set;}
Venue Property string Venue {get;set;}
Duration ScriptProperty System.Object Duration {get=($this.EndTime - $this.StartTime);}
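The alias and script properties above sit on top of the C# type. As a sketch, such members can be attached from PowerShell with Update-TypeData (whether the module does it this way or via a types.ps1xml file is not shown here):

# Sketch: attaching display-friendly members to the C# data type
Update-TypeData -TypeName 'SakuraCon.Relations.Data.Event' `
    -MemberType AliasProperty -MemberName 'Event Title' -Value 'EventTitle'
Update-TypeData -TypeName 'SakuraCon.Relations.Data.Event' `
    -MemberType ScriptProperty -MemberName 'Duration' `
    -Value { $this.EndTime - $this.StartTime }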
Usually, I export this data to CliXML; this way, I can retrieve the contents later as needed. PowerShell provides a convenient cmdlet to do this.
$schedule | Export-Clixml schedule.xml
Importing this is easy:
$schedule = Import-Clixml schedule.xml
This allows offline access to the data, as the exported XML is essentially a snapshot of the obtained data.
From this, I can easily export the data as CSV (used for tabulating schedule information) and as an iCalendar file that can be imported into Google Calendar.
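The CSV side is a one-liner; the iCalendar side can be sketched by emitting the handful of fields Google Calendar needs. This is a rough sketch, not the actual exporter; times are written as floating local times.

# CSV for tabulation
$schedule | Export-Csv schedule.csv -NoTypeInformation

# Minimal iCalendar output (sketch; only the basic VEVENT fields)
$lines = @('BEGIN:VCALENDAR', 'VERSION:2.0', 'PRODID:-//Relations//Schedule//EN')
foreach ($event in $schedule) {
    $lines += 'BEGIN:VEVENT'
    $lines += "UID:$($event.EventId)"
    $lines += "SUMMARY:$($event.EventTitle)"
    $lines += "LOCATION:$($event.Venue)"
    $lines += "DTSTART:$($event.StartTime.ToString('yyyyMMddTHHmmss'))"
    $lines += "DTEND:$($event.EndTime.ToString('yyyyMMddTHHmmss'))"
    $lines += 'END:VEVENT'
}
$lines += 'END:VCALENDAR'
$lines | Set-Content schedule.ics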
Something I like about this data structure is that I can use Where-Object to retrieve the desired information. If I want all the events that happen in room 6C, I would query:
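Something along these lines (a sketch using the properties listed above; the exact snippet may differ), together with a time-based variant:

# All events scheduled for room 6C
$schedule | Where-Object { $_.Venue -eq '6C' }

# Events that haven't ended yet, soonest first: the current one and the next
$now = Get-Date
$schedule | Where-Object { $_.EndTime -ge $now } |
    Sort-Object -Property StartTime | Select-Object -First 2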
The second query retrieves the current event and the next one taking place.
Other PowerShell Core Applications
The schedule is not the only place where I have utilized PowerShell Core. Another area where I use it is generating form letters. Since I have the letter source as LaTeX, this is a matter of passing the personalized arguments through parameters. Since PowerShell has built-in functionality to convert a CSV into a data structure, the list of names is populated from a CSV.
function process {
    param(
        [Parameter(Position = 0, Mandatory = $true)]
        $Name,
        [Parameter(Position = 1, Mandatory = $true)]
        $Identifier
    )
    # uplatex expects Shift_JIS-encoded arguments here
    $OutputEncoding = [System.Text.Encoding]::GetEncoding('shift_jis')
    $docGuid = [guid]::NewGuid().ToString()
    # Inject the guest name into the LaTeX template and build the PDF
    uplatex -kanji=utf8 --no-guess-input-enc -jobname="Welcome_$Identifier" "\newcommand\scguestname{$Name}\input{2019_welcome.tex}"
    dvipdfmx "Welcome_$Identifier"
    # Clean up intermediate TeX artifacts
    Remove-Item @("Welcome_${Identifier}.aux", "Welcome_${Identifier}.log", "Welcome_${Identifier}.out", "Welcome_${Identifier}.dvi") -ErrorAction SilentlyContinue
    $OutputEncoding = New-Object System.Text.ASCIIEncoding
}
$names = Import-Csv welcome.csv
foreach ($item in $names)
{
    $currentloc = (Get-Location).Path
    $identifier = $item.Name -replace " ", "_"
    Write-Host "$identifier"
    # Skip guests whose welcome letter has already been generated
    $fileExist = Test-Path (Join-Path -Path $currentloc -ChildPath "Welcome_${identifier}.pdf")
    if ($fileExist -eq $False)
    {
        process -Identifier $identifier -Name $item.JapaneseName
    }
    else {
        Write-Host -BackgroundColor Red "File exists; please delete the PDF file if you really need to update this file."
    }
}
Because of the way command line arguments are passed, this is the area where I struggled to run this under Windows (because of the way the frontend handles character encoding), and I had to use Windows Subsystem for Linux (WSL) to generate the Japanese letters; but since PowerShell Core is available for both Linux and Windows, the same components and scripts are used unmodified.
This will take about 30 seconds to generate 20 letters.
Conclusion
PowerShell Core provided a cross-platform and consistent environment to support data wrangling within the limited area of relations tasks I was handling.
I am planning to improve the system to support schedule conflict detection and workload verification, as well as staffing.
I’ve consolidated the various elisp scripts for Emacs that I have published on Gist into a project, now called hideki-emacs-utilities. Specialized anime-term SKK dictionaries are also now available through skk-anime-dictionary.
I’m not particularly a Lisper, but I do write some Emacs Lisp for my convenience. Just to pick a few…
Create Random Buffer is a script to create a random buffer. I’m actually wondering why something like this doesn’t exist by default. (Well, there’s the scratch buffer, but…) Useful for experimenting.