This book deals with the creation of the algorithmic backbone that enables a computer to perceive humans in a monitored space by processing the signals that humans use to perform a task, i.e., audio and video. To do so, computers use sensors and algorithms to detect and track multiple intera[...]