Path Planning using Millimetre Wave Autonomous Vehicle Radar
Path planning in inclement weather conditions, including snow and heavy rain, is a significant problem plaguing autonomous vehicle systems. Our work takes a step towards tackling this problem by leveraging existing hardware commonly used in self-driving vehicles, low-cost millimetre wave (mmWave) radar systems, to detect common corner retro-reflectors on roads. This paper presents the results of preliminary experiments, which indicate that the generated path has an average error of 0.19 m ± 0.12 m with full fog coverage of the retro-reflectors. Additional testing shows that, under the same full fog coverage, a lidar sensor is unable to locate the reflectors. Our work is the first of its kind, using only mmWave radar to create and plan a path through a visually obstructed environment.
Brando (Big Rando): A low-cost, energy efficient bipedal robot for human-robot interaction during gait
Legged intelligent robots, such as bipedal robots, can substantially impact human health, social well-being, and mobility by acting as robot companions. Several prototypes have been built and proposed by researchers to navigate human-built and unstructured terrain. Although the technology for designing robots is advancing and new hardware with high computational capability has been introduced, few bipedal robots have been built for safe interaction with humans. Our research aims to develop a lightweight, energy-efficient walking robot, Brando (Big Rando), that can adjust its walking by modulating step length. The robot design and controller are drawn from the human gait principles of minimum metabolic energy per distance and simple inverted pendulum models to provide reliable gait patterns and safety features that enable the robot to be used for human-robot walking experiments.
The Modular Off-Highway Robotic Equipment (MORE) System: A Proof of Concept Prototype
In industries like construction, forestry, and mining, the need for human operators aboard off-highway equipment is being eliminated. This is commonly achieved by retrofitting the existing equipment with sensors and computers. However, it is believed that optimal performance of off-highway equipment is only achievable after a complete design overhaul, given that many of the constraints which bounded existing equipment designs are now obsolete. In an effort to explore the less constrained design space and enable potential functional improvements to such equipment, a proof of concept prototype, referred to as the Modular Off-Highway Robotic Equipment (MORE) System, was developed. The MORE System consists of modular units which can be configured into various shapes to perform basic loading, hauling, and lifting tasks common to off-highway equipment. Such functionality is beyond the capabilities of conventional equipment, which is typically restricted to a single shape and task. This poster describes the mechanical elements of the MORE System, presents preliminary demonstrations of its performance which validate the mechanical design, and highlights the relevant future efforts.
Towards Efficient Learning-Based Model Predictive Control via Feedback Linearization and Gaussian Process Regression
Tracking control of robotic systems forms the foundation for automation solutions. Robotic systems exhibit nonlinear dynamics which complicate controller design, and real-time systems require computationally tractable solutions. The research presented herein introduces a learning-based Model Predictive Control (MPC) method which leverages nonlinear predictions and linear optimization. In particular, feedback linearization is employed for efficient optimization, and Gaussian Process Regression (GPR) is used to model unknown system dynamics. The controller has been verified in simulation on a two-link planar manipulator in the presence of model uncertainty. The controller was shown to outperform a Proportional-Derivative Inverse Dynamics (PD-ID) controller and a GPR-augmented variation of said PD-ID controller. The proposed controller was found to track with only slightly worse accuracy than its fully nonlinear counterpart, while calculating control inputs 82 times faster.
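As a rough illustration of how GPR can stand in for an unknown dynamics term, the sketch below fits a one-dimensional Gaussian process with NumPy and queries its posterior mean. The squared-exponential kernel, the hyperparameters, and the toy residual function are assumptions chosen for illustration; the abstract does not state which kernel or data the authors used.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, signal_var=1.0):
    # Squared-exponential kernel between two 1-D input arrays (assumed kernel).
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior_mean(x_train, y_train, x_query, noise_var=1e-4):
    # Standard GP regression mean: k(x*, X) (K + sigma^2 I)^-1 y
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_star = rbf_kernel(x_query, x_train)
    alpha = np.linalg.solve(K, y_train)
    return k_star @ alpha

# Hypothetical unmodelled dynamics term sampled at 25 operating points.
x_train = np.linspace(-2.0, 2.0, 25)
y_train = 0.5 * np.sin(2.0 * x_train)

x_query = np.array([0.3, 1.1])
pred = gp_posterior_mean(x_train, y_train, x_query)
print(pred)
```

In a learning-based controller, a prediction of this kind would be added to the nominal model before the optimization step; how that is done in the presented method is not specified here.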
Remote Inspection of Bridge Infrastructure using Unmanned Aerial Vehicles (UAVs)
Aging and deteriorating infrastructure assets are one of the major challenges facing provincial governments and municipalities. Visual inspection of bridge infrastructure on a regular basis is critical to ensure that bridges remain in a safe operating condition. In Ontario, municipalities are required to complete visual inspections on a bi-annual basis to determine the physical and functional condition of bridges. Current inspection techniques are highly subjective and inaccurate owing to factors such as formal training in bridge inspection, accessibility, and complexity. This project focuses on the development of an automated UAV-based bridge inspection system capable of providing a complete structural assessment. The system consists of: (1) autonomous UAV-based flight allowing for automated data acquisition to map the structure of interest; (2) creation of a 3D point-cloud and photorealistic model; (3) algorithm-based analysis for defect detection; and (4) defect location mapping to develop a Bridge Information Model (BrIM) to document conditions and deficiencies. This system will give bridge owners the ability to accurately and reliably monitor infrastructure while providing improved documentation for subsequent inspections.
Walking in the Outdoors – Lessons to be Learned From Humans
Healthy humans can maintain stability in very complex environments in everyday life with ease; however, the strategies that allow us to do this are still unclear. In-lab studies have largely concluded that corrective stepping is the main strategy used to overcome perturbations, but these experiments take place in ideal conditions with no obstacles. Outside the lab, obstacles such as ice, snow, curbs, and other pedestrians may limit the corrective stepping strategy, creating the need for another strategy to compensate. We plan to conduct out-of-lab experiments to determine how gait stability strategies change in the winter months, when more obstacles are present. We will collect kinematic and kinetic data using inertial measurement units and pressure-sensitive insoles. Knowledge of the behaviour of humans will allow us to create bio-inspired controllers for robots and assistive exoskeletons that are better equipped for complex environments.
A New Fleet Management Approach for Autonomous Mining Vehicles Using Q-Learning
In modern mining practices, a new goal for the industry is a fully autonomous mine. Most original equipment manufacturers (OEMs) are working towards automating their equipment or parts of the machine’s functionality. Once all the machinery is automated, making the whole fleet work together will be necessary to automate production fully. The cooperation of an autonomous fleet would increase production, worker safety, and overall efficiency. A centralized computer system, monitoring each piece of equipment’s individual data stream and the system overall, can ultimately replace today’s dispatch rooms. The system automatically dispatches all vehicles based on measured conditions around the mine and adapts to new changes as they happen. These changes can include unplanned maintenance, stockpile backups, blending changes, production target changes, and blast scheduling changes. The centralized computer system is controlled by an artificial intelligence (AI) algorithm that can optimize fleet dispatching by directing an individual machine to a new location based on its current internal measurements and the location of all other vehicles in the fleet. This type of system can be classified as a centralized heterogeneous robot swarm, as all vehicles are working towards a common goal and are controlled by a centralized computer, while each robot has a different specific task. The AI will optimize the scenario by altering vehicles’ paths, loading and drop-off points, and loading conditions to ensure the predetermined goal can be achieved. A goal must be chosen for the fleet based on each mine’s individual needs. The algorithm must be trained for each installation and mine to suit individual needs and different fleet configurations. A scaled-down proof of concept was created to showcase the AI and fleet behaviours in certain scenarios and to show how the fleet automatically makes schedule changes.
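The title names Q-learning as the learning method, so a minimal tabular Q-learning update is sketched below. The state and action encoding (three states, two dispatch actions), the reward, and the learning parameters are all hypothetical; the abstract does not describe how the fleet state or rewards are actually defined.

```python
import numpy as np

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    # One-step tabular Q-learning update:
    # Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    td_target = reward + gamma * np.max(q[next_state])
    q[state, action] += alpha * (td_target - q[state, action])
    return q

# Hypothetical toy problem: 3 vehicle states, 2 dispatch actions.
q = np.zeros((3, 2))
q = q_update(q, state=0, action=1, reward=1.0, next_state=2)
print(q[0, 1])  # 0.1 after a single update from an all-zero table
```

Repeating this update over many simulated dispatch decisions is what lets the table converge towards dispatch choices that serve the chosen fleet goal.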
DIYing Inclusion: A Case Study of an ASD Intervention Using Personally Customizable Physical Computing Tools
For all students to feel part of a classroom community, it is imperative that the assistive tools we provide facilitate their inclusion. This is particularly true for students with autism spectrum disorder (ASD), whose difficulties with social communication and interaction mean that they often access their education in specialized settings. The equitable solution is to provide the same tools to all students and allow them to be adapted to suit individual needs. Building on research focused on wearable, customizable technologies, this study aims to highlight the utility of student-centred physical computing tools (e.g., Micro:bits) for inclusive education. Expected contributions include: (1) the creation of a codable physical computing platform that is customizable by school-age children; (2) the development of a feedback system that improves student independence; (3) the potential to use the tools at home; and (4) increased recognition of robotics as a means of creating inclusive classroom environments.
Feasibility of inertial sensors for measuring gait stability
Understanding how healthy humans adapt their gait while walking on real-world surfaces, not only in lab environments, is beneficial to preventing falls in susceptible individuals and improving the design of exoskeleton balance controllers. Inertial Measurement Units (IMUs) enable observation of gait adaptations on real-world surfaces because they are a portable alternative to optical motion tracking. However, the feasibility of using IMUs to measure gait stability is unclear. A study was therefore conducted to determine whether an IMU motion capture suit (Xsens MVN Link) can be used to accurately evaluate gait stability parameters of a walking individual. Lateral Margin of Stability (MoS), maximum Lyapunov exponent, stance times, and gait variability were simultaneously measured with an IMU suit and an optical motion capture system. Based on preliminary results, the IMU suit is likely capable of observing some effects between different walking conditions despite frequent medial bias in foot position.
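For readers unfamiliar with the lateral Margin of Stability, the sketch below computes it using the widely used extrapolated centre of mass formulation (inverted pendulum with eigenfrequency sqrt(g / l)). Whether the study used exactly this formulation is an assumption, and the numerical values are made up for illustration.

```python
import math

def margin_of_stability(com_pos, com_vel, bos_edge, leg_length, g=9.81):
    # Extrapolated centre of mass: XCoM = x + v / omega0, omega0 = sqrt(g / l).
    omega0 = math.sqrt(g / leg_length)
    xcom = com_pos + com_vel / omega0
    # MoS is the distance from the XCoM to the edge of the base of support.
    return bos_edge - xcom

# Hypothetical lateral values in metres and metres per second.
mos = margin_of_stability(com_pos=0.02, com_vel=0.10, bos_edge=0.12, leg_length=0.9)
print(round(mos, 4))
```

A medial bias in the measured foot position shifts `bos_edge` directly, which is why such a bias matters for MoS estimates from an IMU suit.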
Improved Standing Balancing For Legged Robots With Unified Balancing Model
To remain stable, legged robots require active balancing to compensate for unexpected disturbances. While standing, legged robots can utilize three discrete strategies: ankle, hip, and toe. However, existing reduced order models for legged robots capture at most two of these strategies. We propose a unified reduced order model that includes all three standing balance strategies and compare push recovery simulations of the unified model against existing balancing models using a non-linear model predictive controller. We also developed a full body controller for a simple one-legged balancing robot that tracks the controls of the reduced order models. For both the reduced order model and robot simulations, we found that the unified model could recover successfully from the largest disturbances, 3.7% (model) and 3.9% (robot) larger than the next best reduced order model. Our results suggest that successful implementation of a unified reduced order model on physical robots would enable a simplified controller to leverage all balancing strategies as needed for more robust balancing in legged robots.
Degree of Opacity Enforcement with Autonomous Vehicle Searches
Autonomous vehicles are sent into a known section of terrain to find a target whose exact location is unknown. The vehicles communicate when they enter different regions of the terrain to coordinate the search. Unfortunately, an adversary is monitoring the communications. The adversary’s objective is to determine the location of the target within a given search time. The target must be obscured from the adversarial observer while allowing the vehicles freedom to navigate and communicate. There is no set of secret states or strings, since the target’s location is unknown. Traditional opacity formulations do not capture these problem characteristics. We introduce a more general measure of a system’s opacity, called degree of opacity, and apply it to this problem.
Developing Intelligent, Adaptive Simulation and Operational Support to Augment Trauma Response Readiness
Caring for a critically injured trauma patient is a complex and challenging task. As in other complex operational domains, an important characteristic of trauma team leaders is the development of high cognitive efficiency when faced with complex tasks during medical crises. Exceeding cognitive capacity has been shown to significantly degrade performance and learning outcomes. Effectively managing cognitive load, therefore, has profound implications for medical practice and education. Current Research: Funded by the IDEaS Program, DND, the overall objective of our research is to develop real-time, intelligently adaptive AR-based simulations that dynamically tailor scenario complexity to match the learner’s level of cognitive load. The platform combines real-time, sensor-based data acquisition with trained neural-network-based cognitive load classifiers. By enabling the autonomous machine-level interpretation of previously “hidden” cognitive states, the demonstrated capabilities are a significant advancement in adaptive design, not only within the context of advanced augmented and virtual reality environments, but also in support of cognitive performance enhancement across a broad range of complex operational environments.
Enhancing Descriptive Image Captioning with Natural Language Inference
Generating descriptive sentences that convey nontrivial, detailed, and salient information about images is an important goal of image captioning. Despite significant improvements in image captioning performance, existing models tend to play it safe and generate generic captions. In this paper, we explore developing better descriptive image captioning models from a novel perspective: considering that, among different captions of an image, descriptive captions are more likely to entail less descriptive ones, we develop descriptive image captioning models that leverage natural language inference (NLI, also known as recognizing textual entailment).
RCVPose: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting
We propose a novel keypoint voting scheme based on intersecting spheres that is more accurate than existing schemes and allows for a smaller set of more dispersed keypoints. The scheme forms the basis of the proposed RCVPose method for 6 DoF pose estimation of 3D objects in RGB-D data, which is particularly effective at handling occlusions. A CNN is trained to estimate the distance between the 3D point corresponding to the depth mode of each RGB pixel and each of a set of 3 dispersed keypoints defined in the object frame. At inference, a sphere of radius equal to this estimated distance is generated, centered at each 3D point. The surfaces of these spheres vote to increment a 3D accumulator space, the peaks of which indicate keypoint locations. The proposed radial voting scheme is more accurate than previous vector or offset schemes and is robust to dispersed keypoints. Experiments demonstrate RCVPose to be highly accurate and competitive, achieving state-of-the-art results on the LINEMOD (99.7%) and YCB-Video (97.2%) datasets, and notably scoring 7.9% higher than previous methods on the challenging Occlusion LINEMOD dataset (71.1%).
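The radial voting step described above can be sketched as a brute-force accumulation over a small voxel grid. This is only an illustration of the idea under stated assumptions: the grid size, voxel size, shell thickness, and the use of exact distances in place of CNN estimates are all hypothetical, and it is not the authors' implementation.

```python
import numpy as np

def radial_vote(points, radii, grid_min, voxel, shape, shell=0.5):
    """Accumulate sphere-surface votes in a voxel grid (brute force)."""
    idx = np.indices(shape).reshape(3, -1).T      # (V, 3) voxel indices
    centres = grid_min + (idx + 0.5) * voxel      # (V, 3) voxel centres
    acc = np.zeros(len(centres), dtype=int)
    for p, r in zip(points, radii):
        d = np.linalg.norm(centres - p, axis=1)
        # A voxel receives a vote if it lies on the sphere surface,
        # within a tolerance of `shell` voxels.
        acc += np.abs(d - r) <= shell * voxel
    return acc.reshape(shape)

# Hypothetical scene: 30 random 3D points and one keypoint at a voxel centre.
rng = np.random.default_rng(0)
keypoint = np.array([0.45, 0.55, 0.35])
points = rng.uniform(0.0, 1.0, size=(30, 3))
radii = np.linalg.norm(points - keypoint, axis=1)  # stands in for CNN output

acc = radial_vote(points, radii, grid_min=np.zeros(3), voxel=0.1, shape=(10, 10, 10))
peak = np.unravel_index(np.argmax(acc), acc.shape)
print(peak, acc.max())
```

Because every sphere passes through the true keypoint, its voxel collects a vote from every point, which is what makes the accumulator peak a keypoint estimate.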
Siamese Capsule Network for End-to-End Speaker Recognition In The Wild
We propose an end-to-end deep model for speaker verification in the wild. Our model uses thin-ResNet for extracting speaker embeddings from utterances and a Siamese capsule network with dynamic routing as the back-end to calculate a similarity score between the embeddings. We conduct a series of experiments comparing our model to state-of-the-art solutions, showing that our model outperforms all the other models while using substantially less training data. We also perform additional experiments to study the impact of different speaker embeddings on the Siamese capsule network. We show that the best performance is achieved by using embeddings obtained directly from the feature aggregation module of the front-end and passing them to higher capsules using dynamic routing.
A Semi-supervised EEG Learning Approach with Attention-based Recurrent Autoencoder
EEG-based emotion recognition often requires sufficient labeled training samples to build an effective computational model. Labeling EEG data, on the other hand, is often expensive and time-consuming. To tackle this problem and reduce the need for output labels in the context of EEG-based emotion recognition, we propose a semi-supervised pipeline to jointly exploit both unlabeled and labeled data for learning EEG representations. Our goal is to provide a powerful framework by leveraging abundant unlabeled samples, while minimally depending on labeled samples for training. Our semi-supervised framework consists of both unsupervised and supervised components. The unsupervised part maximizes the consistency between original and reconstructed input data using an autoencoder, while simultaneously the supervised part minimizes the cross-entropy between the input and output labels. We evaluate our framework using both a stacked autoencoder and an attention-based recurrent autoencoder. We test our framework on the large-scale SEED EEG dataset and compare our results to several other high-quality semi-supervised methods. Our semi-supervised framework with deep attention-based recurrent autoencoder consistently outperforms the benchmark methods, even when small sub-sets (3%, 5% and 10%) of the output labels are available during training, achieving a new state-of-the-art semi-supervised performance.
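The joint objective described above can be sketched as a reconstruction term computed on all samples plus a cross-entropy term computed on the labeled samples only. The use of mean squared error for reconstruction, the weighting factor, and the array names are assumptions for illustration; the abstract does not give the exact loss.

```python
import numpy as np

def semi_supervised_loss(x, x_recon, logits, labels, labeled_mask, weight=1.0):
    # Unsupervised part: reconstruction error over all samples (MSE assumed).
    recon = np.mean((x - x_recon) ** 2)
    # Supervised part: softmax cross-entropy over labeled samples only.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce_all = -log_probs[np.arange(len(labels)), labels]
    ce = ce_all[labeled_mask].mean() if labeled_mask.any() else 0.0
    return recon + weight * ce

# Hypothetical batch of two samples, only the first of which is labeled.
x = np.zeros((2, 3))
x_recon = np.zeros((2, 3))
logits = np.array([[0.0, 0.0], [5.0, 1.0]])
labels = np.array([0, 1])
labeled_mask = np.array([True, False])

loss = semi_supervised_loss(x, x_recon, logits, labels, labeled_mask)
print(loss)  # equals log(2): perfect reconstruction, uniform prediction
```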
Toward Wearables of the Future: Affordable Acquisition of Continuous ECG with Deep Learning
Electrocardiogram (ECG) is the electrical measurement of cardiac activity, whereas Photoplethysmogram (PPG) is the optical measurement of volumetric changes in blood circulation. While both signals are used for heart rate monitoring, from a medical perspective, ECG is more useful as it carries additional cardiac information. However, there are no reliable solutions for continuous ECG monitoring in wrist-based wearables that are feasible for everyday and pervasive use. We believe continuous wearable-based ECG could enable early diagnosis of cardiovascular diseases, so that early preventative measures can be taken to avoid severe cardiac problems. In this study, our goal is to enable the use of ECG in wrist-based wearable devices, such as smart watches, for continuous cardiac monitoring. We propose CardioGAN, an adversarial model which takes PPG as input and generates ECG as output. Our experiments show that the ECG generated by CardioGAN provides more reliable heart rate measurements than the original input PPG, reducing the HR estimation error from 9.74 beats per minute (measured from the PPG) to 2.89 beats per minute (measured from the generated ECG).
Self-supervised Contrastive Learning of Multi-view Facial Expressions
Facial expression recognition (FER) has emerged as an important component of human-computer interaction systems. Despite recent advancements in FER, performance often drops significantly for non-frontal facial images. We propose Contrastive Learning of Multi-view facial Expressions (CL-MEx) to exploit facial images captured simultaneously from different angles towards FER. CL-MEx is a two-step training framework. In the first step, an encoder network is pre-trained with the proposed self-supervised contrastive loss, where it learns to generate view-invariant embeddings for different views of a subject. The model is then fine-tuned with labeled data in a supervised setting. We demonstrate the performance of the proposed method on two multi-view FER datasets, KDEF and DDCF, where state-of-the-art performances are achieved. Further experiments show the robustness of our method in dealing with challenging angles and reduced amounts of labeled data.
Human Activity Recognition with Self-supervised Learning of Wearable Data
We propose the use of self-supervised learning for human activity recognition with smartphone accelerometer data. Our proposed solution consists of two steps. First, the representations of unlabeled input signals are learned by training a deep convolutional neural network to predict a segment of accelerometer values. Our model exploits a novel scheme to leverage past and present motion in x and y dimensions, as well as past values of the z axis to predict values in the z dimension. This cross-dimensional prediction approach results in effective pretext training with which our model learns to extract strong representations. Next, we freeze the convolution blocks and transfer the weights to our downstream network aimed at human activity recognition. For this task, we add a number of fully connected layers to the end of the frozen network and train the added layers with labeled accelerometer signals to learn to classify human activities. We evaluate the performance of our method on three publicly available human activity datasets: UCI HAR, MotionSense, and HAPT.
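The cross-dimensional pretext task above can be illustrated by showing how one training sample might be cut from an accelerometer recording: past and present x and y values plus past z values form the input, and the present z values form the target. The window lengths and the function name are hypothetical, since the abstract does not specify the segmentation.

```python
import numpy as np

def make_pretext_sample(acc, t, past, future):
    # acc has shape (T, 3) with columns x, y, z.
    xy = acc[t - past : t + future, :2]   # past and present x, y values
    z_past = acc[t - past : t, 2]         # past z values only
    z_target = acc[t : t + future, 2]     # z segment the network must predict
    return xy, z_past, z_target

# Toy recording with 10 time steps; values are just a running index.
acc = np.arange(30).reshape(10, 3)
xy, z_past, z_target = make_pretext_sample(acc, t=6, past=4, future=2)
print(xy.shape, z_past, z_target)  # (6, 2) [ 8 11 14 17] [20 23]
```

No activity labels are needed to build such samples, which is what makes the pretext stage self-supervised.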
A Transformer Architecture for Stress Detection from ECG
Affective computing studies show how machines can recognize and infer human emotions. In this work, we propose an architecture based on convolutional and transformer architectures that uses ECG to detect stress. Our model uses only two convolutional blocks, which is considerably fewer than other works in the area. Our proposed model is tested on two publicly available datasets, WESAD and SWELL-KW, using leave-one-subject-out (LOSO) evaluation.
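Leave-one-subject-out evaluation holds out all recordings of one subject for testing and trains on the rest, repeating this once per subject. A minimal sketch of the splitting logic follows; the subject identifiers are made up, and the function is illustrative rather than the authors' evaluation code.

```python
def loso_splits(subject_ids):
    # Yield (held_out_subject, train_indices, test_indices) for each subject.
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Hypothetical subject label for each of five ECG segments.
subjects = ["s1", "s1", "s2", "s3", "s3"]
splits = list(loso_splits(subjects))
for held_out, train, test in splits:
    print(held_out, train, test)
```

Because no subject appears in both the training and the test set of a fold, the resulting scores reflect generalization to unseen people.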
When Reinforcement Learning Generates Face Trees
We propose an end-to-end architecture for facial expression recognition. Our model learns an optimal tree topology for facial landmarks, whose traversal generates a sequence from which we obtain an embedding to feed a sequential learner. To learn this optimal tree topology, we have adopted reinforcement learning. The proposed architecture incorporates two main streams, one focusing on landmark positions to learn the structure of the face, while the other focuses on patches around the landmarks to learn texture information. Each stream is followed by an attention mechanism and the outputs are fed to a two-stream fusion component to perform the final classification. We conduct extensive experiments on two large-scale publicly available facial expression datasets, AffectNet and FER2013, to evaluate the efficacy of our approach. The preliminary results we obtained show the efficacy of our proposed method.
Estimating Pose from Pressure Data for Smart Beds with Deep Image-based Pose Estimators
In-bed pose estimation has shown value in fields such as hospital patient monitoring, sleep studies, and smart homes. In this paper, we explore different strategies for detecting body pose from highly ambiguous pressure data, with the aid of pre-existing pose estimators. We examine the performance of pre-trained pose estimators by using them either directly or by re-training them on two pressure datasets. We also explore other strategies utilizing a learnable pre-processing domain adaptation step, which transforms the vague pressure maps to a representation closer to the expected input space of common purpose pose estimation modules. Accordingly, we used a fully convolutional network with multiple scales to provide the pose-specific characteristics of the pressure maps to the pre-trained pose estimation module. Our complete analysis of different approaches shows that the combination of the learnable pre-processing module along with re-training pre-existing image-based pose estimators on the pressure data is able to overcome issues such as highly vague pressure points to achieve very high pose estimation accuracy.