AudioVisual works in Monolith
This aims to be a best-practices guide/tutorial for making audio-visual works in Monolith. The code described below will generate a square wave in the sound domain and a square in the visual domain. An LFO signal is used to modulate both the frequency of the square wave and the scaling factor of the square.
Tangled Files
singing_square.scm is the main file. It's Scheme code which controls the show and renders the sound.
(monolith:load "ugens.scm")
<<singing-square>>
<<render-block>>
(define (render-singing-square fps dur)
<<enable-offline-mode>>
<<realloc-blocksize>>
<<init-janet>>
<<setup-video>>
<<setup-sound>>
<<render>>
<<finish>>
)
(render-singing-square 60 10)
singing_square.janet is a Janet file used to draw the square.
<<gfx-init>>
<<draw>>
<<render-frame-block>>
singing_square.sh is a little shell script that runs monolith, converts the wav to mp3 via lame, and then generates an mp4 file via ffmpeg.
<<run-monolith>>
<<mp3-conversion>>
<<ffmpeg>>
How to run
First, tangle up all the code using worgle from the top-level directory:
worgle doc/audiovisual.org
Once tangled, run the generated shell script:
sh singing_square.sh
With any luck, a file called singing_square.mp4 will appear.
Main setup
AV stuff is limited to offline rendering only, because there is no way to render video in realtime. Monolith will render an h264 video file and an audio file (wav). These two are then stitched together into an mp4 file via ffmpeg. The wav file can optionally be converted to an mp3 file via lame; this is done because some video players don't support wav.
It usually ends up that sounds are created in a realtime configuration, then rendered with the video later. Video design tends to be more guess-and-wait-and-check.
Offline Mode
In order to set up the renderer, monolith must be started in offline mode. This can be done with monolith:start-offline.
(monolith:start-offline)
Block Reallocation
The internal graforge configuration is reallocated to use a block size of 49 with monolith:realloc. This is done to make blocks line up with frames better, as the default size of 64 does not. A block size of 49 divides the samples up evenly when the sampling rate is 44.1kHz (63 also works; it may be worth trying out).
(monolith:realloc 8 10 49)
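To see why 49 lines up nicely, here is a quick sanity check (a throwaway sketch, not part of the tangled program; the numbers assume 44.1kHz audio at 60 fps):
;; 44100 samples/sec over 60 frames/sec is 735 samples per frame,
;; and 735 / 49 = 15, so every frame spans exactly 15 blocks.
(display (/ 44100 60)) (newline) ; 735
(display (/ 735 49)) (newline)   ; 15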
Janet Setup
Graphics are pretty much always done using Janet, which is embedded in Monolith and controlled from inside of Scheme. Janet is initialized with (monolith:janet-init).
It is best to have a top-level Janet file to import, then a top-level Janet function to initialize stuff with.
(monolith:janet-init)
(monolith:janet-eval "(import singing_square)")
(monolith:janet-eval "(singing_square/gfx-init)")
h264 setup
An h264 video file is opened using monolith:h264-begin. I tend to prefer using a framerate of 60 fps.
(monolith:h264-begin "singing_square.h264" fps)
Patch Setup
Now a monolith patch is created and set up to render to a wavfile. More on this later.
(singing-square)
(wavout zz "singing_square.wav")
(out zz)
Rendering
Finally, the actual rendering happens. This is done using the monolith:repeat function, which calls a function a certain number of times. Each time the function is called, a new frame is written along with a block of audio that encompasses the frame. Multiplying the intended duration in seconds by the FPS gives the number of frames that need to be rendered (here, 10 seconds at 60 fps is 600 frames).
(monolith:repeat render-block (* dur fps))
render-block is a Scheme function in charge of rendering a frame of video and a block of sound. I will often put Janet in charge of rendering the frame block instead of Scheme, so this function simply evaluates a Janet function with no arguments.
(define (render-block)
(monolith:janet-eval "(singing_square/render-block)"))
Finishing up
After rendering, things are wrapped up with monolith:h264-end.
(monolith:h264-end)
That's the overall structure of the program!
Janet Stuff
gfx-init is called from Scheme (via monolith:janet-eval). At the very least, it initializes the framebuffer.
(defn gfx-init []
(monolith/gfx-fb-init))
After lots of trial and error, I've found that the cleanest approach for creating a frame-block is to draw, then compute the block before appending. This is the best approach because it guarantees that something gets drawn on the first frame. Some AV latency issues may occur because of this, but there are some tolerable hacks with delays that I do to correct it.
monolith/compute is used to compute the block. The block size is determined with sr / fps, where sr is the sampling rate and fps is the frames per second. In other words, this tells you how many samples of audio are needed to compute one frame of video (at 44.1kHz and 60 fps, that is 735 samples per frame).
It's helpful to have some kind of progress indicator. One thing to do is to keep track of the frame position and print it every second (every 60 frames, in this case).
(var framepos 0)
(var fps 60)
(var sr 44100)

(defn render-block []
  (draw)                                       ; draw first, so the first frame isn't blank
  (if (= (% framepos fps) 0) (print framepos)) ; progress: print once per second
  (monolith/compute (math/floor (/ sr fps)))   ; compute one frame's worth of audio
  (monolith/h264-append)                       ; append the frame to the video
  (set framepos (+ framepos 1)))
The Singing Square
Visuals
What to draw? How 'bout a nice blue square. The scaling of the rectangle can be modulated by some signal in the audio domain, stored in channel 0.
(defn draw []
  ;; colors, as RGB bytes: allports (a blue) for the square,
  ;; blue-romance for the background
  (var allports @[0x32 0x72 0x9c])
  (var blue-romance @[0xd2 0xf9 0xde])
  ;; channel 0 holds the LFO written by the sound patch (0..1)
  (def scale (monolith/chan-get 0))
  ;; square size sweeps from 30 to 110 pixels with the LFO
  (def size (+ 30 (* 80 scale)))
  ;; center the square in the framebuffer
  (def cx (/ (monolith/gfx-width) 2))
  (def cy (/ (monolith/gfx-height) 2))
  (def x (math/floor (- cx (/ size 2))))
  (def y (math/floor (- cy (/ size 2))))
  ;; fill the background, then draw the square
  (monolith/gfx-fill
    (blue-romance 0)
    (blue-romance 1)
    (blue-romance 2))
  (monolith/gfx-rect-fill x y size size
    (allports 0)
    (allports 1)
    (allports 2)))
Sound
What to squawk? How 'bout a nice filtered square oscillator, whose frequency is modulated by a sinusoidal LFO? A copy of this LFO will be stored in monolith channel 0 to scale the square mentioned previously.
(define (singing-square)
  (biscale (sine 0.2 1) 0 1) ; 0.2 Hz sine LFO, rescaled from -1..1 to 0..1
  (bdup)                     ; duplicate the LFO
  (monset zz 0)              ; store one copy in channel 0 (read by the visuals)
  (scale zz 48 60)           ; map the other copy to MIDI notes 48-60
  (sine 6 0.3)               ; 6 Hz vibrato
  (add zz zz)                ; add the vibrato to the note signal
  (mtof zz)                  ; convert MIDI note to frequency
  (blsquare zz 0.5 0.5)      ; band-limited square oscillator
  (butlp zz 1000)            ; Butterworth lowpass at 1kHz
  <<some-reverb>>
)
Oh heck. Let's add some reverb too. Or as John Chowning (allegedly) calls it, "adding some ketchup".
(bdup)                   ; two extra copies of the dry signal feed the reverb
(bdup)
(revsc zz zz 0.93 10000) ; stereo reverb: feedback 0.93, lowpass at 10kHz
(bdrop)                  ; drop one of the two reverb outputs
(mul zz (ampdb -20))     ; turn the reverb down by 20dB
(dcblock zz)             ; remove any DC offset
(add zz zz)              ; mix the reverb in with the dry signal
Running and Rendering
So! That's all the basic parts. The Scheme file can be rendered with monolith from inside the doc directory with:
./monolith -l p/monolith.scm singing_square.scm
Two files will be generated: singing_square.h264 and singing_square.wav.
Encode wav to mp3 with lame:
lame --preset insane singing_square.wav
Then, stitch things into an mp4 file with ffmpeg:
ffmpeg -y -i singing_square.mp3 \
-i singing_square.h264 \
-vf format=yuv420p singing_square.mp4
The colorspace is manually converted to yuv420, because monolith by default saves to yuv444. yuv444 is much better for pixel-art style videos where every pixel counts, but it's not always supported by video players. yuv420 is used to maximize portability.