A video is converted into a sequence of ASCII-art frames. Each frame is first downscaled and converted to grayscale, and its contrast is increased with contrast limited adaptive histogram equalization (CLAHE). Each pixel value is then mapped to an ASCII character by its relative brightness, using a lookup table. Finally, the ASCII characters are drawn onto images with PIL, and the images are stitched back into a video with OpenCV.
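The brightness-to-character step can be sketched as follows. This is a minimal illustration, not the code used for the video: the character ramp `RAMP` and the helper names `pixel_to_char` and `frame_to_ascii` are assumptions, and the frame is assumed to be already downscaled, grayscale, and contrast-enhanced, given as rows of 8-bit values.

```python
# Characters ordered from darkest to brightest.
# This particular ramp is an assumption, not the one used in the video.
RAMP = " .:-=+*#%@"


def pixel_to_char(value, ramp=RAMP):
    """Map an 8-bit brightness value (0-255) to a character in the ramp."""
    # Integer scaling keeps the index within 0 .. len(ramp) - 1.
    index = value * len(ramp) // 256
    return ramp[index]


def frame_to_ascii(gray_frame, ramp=RAMP):
    """Convert a grayscale frame (rows of 0-255 values) to lines of text."""
    return ["".join(pixel_to_char(v, ramp) for v in row) for row in gray_frame]


if __name__ == "__main__":
    # A tiny 2x4 "frame" with a horizontal brightness gradient.
    frame = [
        [0, 64, 128, 255],
        [255, 128, 64, 0],
    ]
    for line in frame_to_ascii(frame):
        print(line)
```

Each returned string corresponds to one row of the frame and would then be drawn onto an image in the rendering step.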
At 10% of the resolution of a 1080p, 30 fps video (rendered at 15 fps), render time is about equal to the duration of the video. The main bottleneck is drawing the ASCII text onto images with PIL.
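To give a rough sense of the drawing workload, the figures above can be turned into a character count. This is only a back-of-the-envelope sketch: it assumes that "10% resolution" means 10% along each dimension and that one character is drawn per downscaled pixel, and the helper name `ascii_grid` is hypothetical.

```python
def ascii_grid(width, height, percent):
    """Return (columns, rows) after scaling each dimension to `percent` %."""
    return width * percent // 100, height * percent // 100


# Assumption: 10% applies per dimension of a 1920x1080 source.
cols, rows = ascii_grid(1920, 1080, 10)
chars_per_frame = cols * rows
# The output is rendered at 15 fps.
chars_per_second = chars_per_frame * 15

if __name__ == "__main__":
    print(cols, rows, chars_per_frame, chars_per_second)
```

Under these assumptions, each frame is a 192 x 108 grid of 20,736 characters, or 311,040 characters for every second of output video.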
Sound and narration are overlaid with external video editing software.