Computer vision enables computers to see, analyse, and understand images or videos, mimicking human sight using software and hardware.
Image recognition: Identifying objects in photos or videos.
Facial recognition, motion tracking, depth sensing.
Machine learning / deep learning: Systems learn to interpret visual data from training datasets.
Convolutional Neural Networks (CNNs): Used in image recognition.
Edge detection, feature extraction, object tracking.
Segmentation: Dividing an image into useful parts.
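The segmentation idea above can be sketched as a minimal threshold-based segmenter. This is only an illustration, not a production technique: the 5×5 "image" and the threshold value are invented, and real segmentation methods are far more sophisticated.

```python
# A toy 5x5 greyscale "image" (values invented for illustration):
# a bright 2x2 object on a dark background.
image = [
    [10,  12,  11, 10, 13],
    [12, 200, 210, 11, 10],
    [11, 205, 198, 12, 11],
    [10,  12,  11, 10, 12],
    [13,  10,  12, 11, 10],
]

def segment(image, threshold=128):
    """Divide the image into two 'useful parts': pixels brighter than
    the threshold become foreground (1), the rest background (0)."""
    return [[1 if p > threshold else 0 for p in row] for row in image]

mask = segment(image)  # the foreground mask picks out the bright object
```

Even this crude approach shows the core idea: segmentation turns raw pixel values into labelled regions that later stages (object detection, tracking) can reason about.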
Self-driving cars (detecting pedestrians, road signs)
Face unlock on phones
Surveillance and security systems
Medical imaging (e.g., detecting tumors)
Accuracy and bias—vision systems may fail on non-standard inputs or be biased.
Privacy—surveillance tech raises ethical concerns.
Dataset quality—garbage in, garbage out.
Can assist and enhance human capability (e.g., in medicine or accessibility).
Concerns about loss of privacy, over-reliance, or misuse (e.g., tracking people).
Different views: Tech companies see efficiency; critics focus on ethics and fairness.
Dr Tim Bell ran a webinar at the start of 2025 to help teachers understand Computer Vision. The recording can be found on YouTube: L3 External - Computer Vision Webinar
This online webinar looks at what the general topic is about, and some specific learning activities that you can use with your students to engage them with the key concepts in the topic.
First you need to understand what Computer Vision entails.
The GOAT: The CS Field Guide: Computer Vision. Watch the video and read through the material. If you learn this stuff inside out, you'll smash it.
Digital cameras, while essential for computer vision, present several challenges for computer science applications. These include issues like noise, varying lighting conditions, perspective and scale changes, occlusion, and the need for large amounts of labeled data. Additionally, real-time processing of complex tasks and ensuring model interpretability are ongoing concerns.
1. Noise: Digital cameras can introduce noise into images due to electronic interference, particularly in low-light conditions. This noise can make it harder for computer vision algorithms to accurately identify objects and features.
2. Lighting Variations: Fluctuations in lighting can significantly impact image quality and make it difficult for algorithms to generalize across different lighting conditions.
3. Perspective and Scale Changes: Objects in images can appear at different sizes and angles, making it challenging for algorithms to recognise them consistently.
4. Occlusion: Objects can be partially hidden or blocked by other objects, making it harder for computer vision systems to identify the occluded parts.
5. Lens Distortion and Vignetting: Digital cameras can introduce lens distortions (curved lines) and vignetting (darker corners) that need to be corrected for accurate analysis.
1. Data Requirements: Computer vision models, especially those based on deep learning, require large amounts of labeled data for training. Acquiring and labeling this data can be time-consuming and expensive.
2. Real-time Processing: Complex tasks in computer vision often require significant computational resources, making it challenging to achieve real-time performance, especially on resource-constrained devices.
Addressing these challenges requires ongoing research and development in areas like robust algorithms, advanced sensor technology, and efficient processing techniques.
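The noise problem from the list above can be made concrete with a tiny sketch: a 1-D "scanline" of pixel values containing two noise spikes, and a mean filter that suppresses them by averaging each pixel with its neighbours. All values are invented for illustration.

```python
# An invented 1-D scanline of brightness values with two noise
# "spikes" (e.g. from low-light electronic interference).
noisy = [20, 22, 90, 19, 21, 20, 95, 18, 20, 21]  # 90 and 95 are noise

def mean_filter(pixels, radius=1):
    """Replace each pixel with the average of its neighbourhood;
    isolated noise spikes get diluted by their neighbours."""
    smoothed = []
    for i in range(len(pixels)):
        window = pixels[max(0, i - radius): i + radius + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

smoothed = mean_filter(noisy)  # the spikes are much less extreme now
```

The trade-off is also visible here: averaging blurs genuine detail as well as noise, which is one reason robust algorithms remain an active research area.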
Computer vision is a major area of computer science because it bridges the gap between how humans perceive the world and how computers interpret visual data.
Computer vision enables machines to "see" and interpret visual information, making it essential in fields such as:
Healthcare – medical image analysis, cancer detection, surgical assistance
Security – facial recognition, surveillance, biometric verification
Automotive – autonomous vehicles, lane detection, pedestrian recognition
Retail – inventory tracking, self-checkout systems, customer behaviour analysis
Agriculture – crop monitoring, disease detection
Manufacturing – defect detection, automation, quality control
Computer vision is tightly integrated with AI and machine learning. It allows systems to learn from visual data, adapt to new environments, and improve performance over time. Techniques like deep learning (e.g. convolutional neural networks) have revolutionised how visual tasks are approached.
Computer vision combines:
Mathematics and Statistics – for modelling image features and transformations
Algorithms and Data Structures – for efficient processing
Human Vision and Psychology – to replicate perception and recognition
Hardware Engineering – for image sensors and optimised processing
This makes it a rich research domain that advances multiple areas of computer science simultaneously.
With the explosion of digital images and video (e.g. social media, smartphones, CCTV), there’s an increasing need for systems that can analyse, sort, and act on this data without human intervention.
Computer vision underpins major emerging technologies:
Augmented Reality (AR) and Virtual Reality (VR)
Robotics and Drones
Smart cities and IoT
Human-computer interaction (e.g. gesture recognition)
A visual explanation of how convolution works.
Brandon Rohrer https://www.youtube.com/watch?v=B-M5q51U8SM
Timestamped Contents
0:01 What is a convolution and why it powers AI
0:20 Overview of the image (14×14 grid) and kernel
0:55 Reversing the kernel (top-bottom, left-right flip or 180° rotation)
1:17 Visualising values with colours
1:26 Definition of convolution operation
2:06 Element-by-element example 1 (simple 1×1 activation)
3:14 Element-by-element example 2 (multiple active pixels)
4:18 Element-by-element example 3 (producing negative value)
5:36 Summary – Dot product and convolution result
6:11 Example: diagonal gradient kernel detecting diagonals in an “O” shape
7:02 Vertical kernel detecting vertical features
8:03 Convolving with an "X" image – detecting only right-leaning strokes
9:07 Convolution as replication – reproducing kernels at non-zero positions
10:05 Demonstrating convolution with single-pixel images
11:29 Arbitrary kernel sizes and shapes
12:13 Why convolution result is smaller than the original image
13:08 Padding – techniques to retain image size after convolution
14:32 Blurring kernel – diffusion of brightness into neighbours
15:40 5×5 blur kernel applied to natural image
16:20 Diagonal gradient kernel on dog image – feature extraction
17:24 Multiple kernels detecting different features (horizontal, diagonal)
18:01 Sharpening kernel – enhancing contrast between central and neighbouring pixels
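The flip-then-dot-product operation the video walks through can be sketched in a few lines of Python. This is a minimal "valid" convolution with no padding (so, as the video explains, the result is smaller than the input); the image and kernel values are invented for illustration.

```python
def convolve2d(image, kernel):
    """'Valid' 2-D convolution: flip the kernel 180 degrees, then slide
    it over the image, taking a dot product (element-wise multiply and
    sum) at each position."""
    kh, kw = len(kernel), len(kernel[0])
    flipped = [row[::-1] for row in kernel[::-1]]  # top-bottom + left-right flip
    out_h = len(image) - kh + 1    # result is smaller than the input
    out_w = len(image[0]) - kw + 1
    result = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            row.append(sum(
                image[i + u][j + v] * flipped[u][v]
                for u in range(kh) for v in range(kw)
            ))
        result.append(row)
    return result

# A single bright pixel convolved with a 3x3 kernel of ones (an
# unnormalised blur) spreads its brightness to every neighbouring
# position, echoing the video's "convolution as replication" idea.
image = [[0, 0, 0, 0],
         [0, 9, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
ones = [[1, 1, 1]] * 3
blurred = convolve2d(image, ones)  # 2x2 result, every entry 9
```

Swapping in other kernels from the video (diagonal gradient, sharpening) is just a matter of changing the `kernel` argument.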
Now you have learnt how a convolutional kernel is used to detect features such as lines and to process an image. Use this example to see how it works. It is an interactive visualisation of neural networks trained on handwritten digit recognition, with the intent of showing the actual behaviour of the network given user-provided input. You can interact with the network through a drawing pad and watch its activation patterns respond in real time. The visualisation is available at http://scs.ryerson.ca/~aharley/vis/.
https://adamharley.com/nn_vis/harley_vis_isvc15.pdf
Random forests and decision trees are both used in computer vision, but they serve different purposes and have different strengths. Decision trees are simple, interpretable models that can be used for tasks like image classification and object detection. Random forests, on the other hand, are ensemble methods that combine multiple decision trees to improve accuracy and robustness. They are particularly useful when dealing with complex datasets and high-dimensional data, like those often encountered in computer vision.
Decision Trees Explained:
A decision tree consists of nodes, branches, and leaf nodes.
Nodes: Represent questions or decisions based on input features.
Branches: Represent the outcomes or possible values of a decision.
Leaf Nodes: Represent the final prediction or classification based on the path taken through the tree.
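The node/branch/leaf structure above can be sketched as a hand-written tree for a toy image-region classifier. The features (mean brightness on a 0-1 scale and an edge count) and the labels are invented for illustration; a real tree would be learnt from training data.

```python
# A hand-written decision tree for a toy image-region classifier.
def classify_region(brightness, edge_count):
    """Nodes ask a question about a feature, branches are the possible
    answers, and leaf nodes return the final label."""
    if brightness > 0.5:            # node: is the region bright?
        if edge_count > 20:         # node: does it contain many edges?
            return "text"           # leaf
        return "sky"                # leaf
    if edge_count > 20:             # node: dark but highly textured?
        return "foliage"            # leaf
    return "shadow"                 # leaf

label = classify_region(brightness=0.8, edge_count=5)
```

Following any input through the `if` statements traces one path from the root node to a leaf, which is exactly how a trained tree makes a prediction.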
Decision trees are a valuable tool in both machine learning and computer vision, offering a structured approach to analysing visual data and making decisions. They are particularly useful for image classification and object detection, breaking down complex visual information into simpler, understandable steps.
Simplicity and Interpretability:
Decision trees are easy to understand and visualise, making them useful for tasks where interpretability is important.
Feature Importance:
They can provide insights into which image features are most important for a given task.
Limited Complexity:
Decision trees can struggle with complex, high-dimensional data like those found in many computer vision problems.
Improved Accuracy:
Random forests combine multiple decision trees, which can lead to more accurate predictions than single decision trees.
Robustness:
They are more robust to noise and overfitting than single decision trees.
Feature Importance:
Like decision trees, random forests can also provide feature importance information.
Computational Cost:
Training and using random forests can be more computationally expensive than single decision trees, especially with large datasets.
Ensemble Learning:
Random forests use ensemble learning by combining multiple decision trees, where each tree is trained on a different random subset of the data and features. This bagging technique helps reduce overfitting and improve generalisation.
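Bagging and majority voting can be sketched with depth-1 trees ("stumps") on a toy 1-D dataset. Everything here is invented for illustration: a real random forest uses deeper trees, many features, and random feature subsets as well as random data subsets.

```python
import random
from collections import Counter

random.seed(0)  # make the sketch repeatable

# Toy 1-D dataset: a single feature value -> class (0 = dark, 1 = bright).
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.6, 1), (0.8, 1), (0.9, 1)]

def train_stump(sample):
    """A depth-1 decision tree ('stump'): pick the split threshold that
    classifies the most points in this sample correctly, predicting
    class 0 at or below the threshold and class 1 above it."""
    best_t, best_score = None, -1
    for t, _ in sample:
        score = sum((x <= t) == (label == 0) for x, label in sample)
        if score > best_score:
            best_t, best_score = t, score
    return best_t

def forest_predict(forest, x):
    """Majority vote across all trees: the ensemble's prediction."""
    votes = [0 if x <= t else 1 for t in forest]
    return Counter(votes).most_common(1)[0][0]

# Bagging: each stump trains on a different bootstrap sample (drawn
# with replacement), so the individual trees disagree in useful ways.
forest = [train_stump(random.choices(data, k=len(data))) for _ in range(25)]
prediction = forest_predict(forest, 0.85)
```

Any single stump can be misled by an unlucky bootstrap sample, but the majority vote across 25 of them is far more robust, which is the whole point of the ensemble.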
0:05 Introduction – What is AI ethics and why it matters
0:32 The core question: “Just because we can build it, should we?”
0:49 Ethics in real-world AI decisions (hiring, crime prediction, loans)
1:07 Overview: Bias, Fairness, Transparency, Accountability
1:55 Bias in AI – Sources, real-world examples, and consequences
2:25 Example: Facial recognition underperforming on dark-skinned faces
2:55 Example: Hiring algorithms learning bias from historical data
3:08 Example: Predictive policing and feedback loops
3:17 Example: Healthcare tools underrepresenting women and minorities
3:24 Causes of bias: training data, lack of team diversity, poor auditing
3:57 What does fairness mean in AI?
4:33 Equality vs Equity explained
4:41 Types of fairness: Demographic parity, Equal opportunity, Individual fairness
5:07 Accuracy ≠ fairness: group differences in performance
5:21 Fairness is a values problem, not just a math problem
The “Black Box” problem in AI
6:10 What does transparency look like?
6:30 Explainable AI (XAI): clarity in how decisions are made
6:56 Trade-off between model accuracy and interpretability
7:00 Why accountability matters (e.g. AI-caused car accident)
7:23 Legal responsibility in AI outcomes
7:36 International examples: The EU AI Act
8:00 How to build accountability: audit trails, human-in-the-loop, and clear ownership
8:21 Summary: Accountability = humans remain in charge of AI outcomes
8:29 Ethical AI needs to be designed in from the start, not patched in later
8:56 What can individuals do? Ask questions, demand transparency, support diversity, stay informed
9:47 Wrapping up: AI ≠ conscious — humans are, so values must guide AI use
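One of the fairness metrics the video mentions, demographic parity, can be computed with a few lines of Python: compare the rate of positive outcomes each group receives. The decision records below are invented for illustration.

```python
from collections import Counter

# Hypothetical model decisions (1 = approved, 0 = declined), recorded
# per group; values invented for illustration.
decisions = [
    ("group_a", 1), ("group_a", 1), ("group_a", 0), ("group_a", 1),
    ("group_b", 1), ("group_b", 0), ("group_b", 0), ("group_b", 0),
]

def approval_rates(decisions):
    """Demographic parity compares each group's approval rate:
    a large gap suggests the model treats the groups differently."""
    totals, approved = Counter(), Counter()
    for group, outcome in decisions:
        totals[group] += 1
        approved[group] += outcome
    return {g: approved[g] / totals[g] for g in totals}

rates = approval_rates(decisions)  # group_a: 0.75, group_b: 0.25
```

As the video stresses, a gap like this flags a *possible* problem; deciding whether it is acceptable is a values question, not just a maths one.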
DTTA will provide one at the start of Term 3. This will be advertised on the DTTA Mobilise forum.
These are the DTTA Derived Grade Exam Resources for 91908, provided in 2024.
Your teacher will provide this. Do your best and remember to give specific examples!