Human Body Keypoint Detection on iOS with the Vision Framework

Posted on 2023-03-04 Edited on 2023-05-09

Starting with iOS 14 and macOS 11, Vision can detect human body poses. It recognizes 19 body keypoints, as shown in the figure:
[Figure: human body keypoints (人体关键点.png)]

Implementation

1. Create a request

The Vision framework provides body pose detection through VNDetectHumanBodyPoseRequest. The following code shows how to detect body keypoints in a CGImage.

import UIKit
import Vision

// The statements below are assumed to run inside a function or method.
// Get the CGImage on which to perform requests.
guard let cgImage = UIImage(named: "bodypose")?.cgImage else { return }

// Create a new image-request handler.
let requestHandler = VNImageRequestHandler(cgImage: cgImage)

// Create a new request to recognize a human body pose.
let request = VNDetectHumanBodyPoseRequest(completionHandler: bodyPoseHandler)

do {
    // Perform the body pose-detection request.
    try requestHandler.perform([request])
} catch {
    print("Unable to perform the request: \(error).")
}

2. Process the results

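When the request finishes, its completion closure is called, and the closure gives access to the detection results and any error. If body keypoints are detected, they are returned as an array of VNHumanBodyPoseObservation. Each VNHumanBodyPoseObservation contains the recognized keypoints and a confidence score; the higher the confidence, the more accurate the recognition.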

func bodyPoseHandler(request: VNRequest, error: Error?) {
    guard let observations =
            request.results as? [VNHumanBodyPoseObservation] else { 
        return 
    }
    
    // Process each observation to find the recognized body pose points.
    observations.forEach { processObservation($0) }
}
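
Since each recognized point carries a confidence value, one possible follow-up is to discard low-confidence points before using them. The sketch below illustrates the idea in plain Swift, without Vision: Keypoint is a hypothetical stand-in for a recognized point, reliableKeypoints is a hypothetical helper, and the 0.3 threshold is an arbitrary assumption rather than a value recommended by the framework.

```swift
// Hypothetical stand-in for a recognized point (not a Vision type).
struct Keypoint {
    let name: String
    let x: Double          // normalized, 0...1
    let y: Double          // normalized, 0...1
    let confidence: Float  // 0...1, higher means more reliable
}

// Keep only the points whose confidence reaches the threshold.
// The default of 0.3 is an illustrative assumption.
func reliableKeypoints(_ points: [Keypoint],
                       minimumConfidence: Float = 0.3) -> [Keypoint] {
    return points.filter { $0.confidence >= minimumConfidence }
}

// Made-up sample data, for demonstration only.
let detected = [
    Keypoint(name: "neck", x: 0.50, y: 0.80, confidence: 0.92),
    Keypoint(name: "leftHip", x: 0.45, y: 0.40, confidence: 0.10),
    Keypoint(name: "rightHip", x: 0.55, y: 0.40, confidence: 0.65)
]
let kept = reliableKeypoints(detected)
print(kept.map { $0.name })
```

In a real handler the same filter could be applied to the points read from each observation; a suitable threshold depends on the images and would need tuning.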

3. Retrieve the keypoints

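Use VNHumanBodyPoseObservation.JointName to look up the coordinates of a given keypoint. Note that the points returned by recognizedPoints(_:) are normalized to the range [0, 1] with the origin at the lower-left corner, so they need to be converted before use.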

func processObservation(_ observation: VNHumanBodyPoseObservation) {
    
    // Retrieve all torso points.
    guard let recognizedPoints =
            try? observation.recognizedPoints(.torso) else { return }
    
    // Torso joint names in a clockwise ordering.
    let torsoJointNames: [VNHumanBodyPoseObservation.JointName] = [
        .neck,
        .rightShoulder,
        .rightHip,
        .root,
        .leftHip,
        .leftShoulder
    ]
    
    // Retrieve the CGPoints containing the normalized X and Y coordinates.
    let imagePoints: [CGPoint] = torsoJointNames.compactMap {
        guard let point = recognizedPoints[$0], point.confidence > 0 else { return nil }
        
        // Translate the point from normalized-coordinates to image coordinates.
        // `imageSize` is assumed to be the size of the source image, defined elsewhere.
        return VNImagePointForNormalizedPoint(point.location,
                                              Int(imageSize.width),
                                              Int(imageSize.height))
    }
    
    // Draw the points onscreen.
    draw(points: imagePoints)
}
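
Because Vision's normalized points use a lower-left origin while UIKit views place the origin at the top-left, drawing the points usually needs a vertical flip in addition to scaling. A minimal sketch of that conversion in plain Swift follows. Point2D, viewPoint(fromNormalized:width:height:) and the 200 x 100 size are hypothetical names and values chosen for illustration, and the sketch assumes the image fills the drawing area exactly.

```swift
// Hypothetical minimal point type, so the sketch needs no UIKit or CoreGraphics.
struct Point2D: Equatable {
    let x: Double
    let y: Double
}

// Convert a normalized point (range 0...1, origin at the lower-left corner)
// into coordinates with the origin at the top-left corner.
func viewPoint(fromNormalized p: Point2D, width: Double, height: Double) -> Point2D {
    // Scale x directly; flip y so that 1.0 (top in Vision) becomes 0 (top in UIKit).
    return Point2D(x: p.x * width, y: (1.0 - p.y) * height)
}

// A point a quarter of the way across and three quarters of the way up.
let converted = viewPoint(fromNormalized: Point2D(x: 0.25, y: 0.75),
                          width: 200, height: 100)
print(converted)
```

If the image is displayed with aspect-fit or aspect-fill scaling, an extra offset and scale factor would be needed on top of this flip.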

Further reading

Besides Vision's VNDetectHumanBodyPoseRequest, human body pose recognition can also be implemented with Core ML. See the official sample, Detecting Human Body Poses in an Image, which runs on iOS 13 and later.