Web API

Introduction

The Waygo API lets you extract text from images (optical character recognition, or OCR) automatically and at scale. The API currently supports text extraction for Chinese, English, Japanese and Korean. The detect endpoint returns the position, text color and background color of the detected text. If you have any questions about the Waygo API or the documentation, please reach out to sdk@waygoapp.com.

Authentication

To authorize, use this code:

import waygo

api = waygo.authorize('yourapikey')

# With shell, you can just pass the correct header with each request
curl "api_endpoint_here" \
  -H "Authorization: yourapikey"

Make sure to replace yourapikey with your API key.

Waygo uses API keys to allow access to the API. You can request a new Waygo API key by contacting us.

Waygo expects the API key to be included in every API request to the server, in a header that looks like the following:

Authorization: yourapikey

You must replace yourapikey with your personal API key.
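
As a minimal sketch of the header described above, the following Python snippet attaches the API key to a request object using only the standard library. The helper name build_request and the example.invalid URL are hypothetical placeholders and are not part of the waygo package; the request is built but never sent.

```python
import urllib.request


def build_request(url: str, api_key: str) -> urllib.request.Request:
    """Return a Request carrying the API key in the Authorization header.

    Hypothetical helper for illustration only.
    """
    if not api_key:
        raise ValueError("an API key is required")
    return urllib.request.Request(url, headers={"Authorization": api_key})


# Build (but do not send) a request to a placeholder URL.
req = build_request("http://example.invalid/api_endpoint_here", "yourapikey")
print(req.get_header("Authorization"))
```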

Image

Detect text

import waygo

api = waygo.authorize('yourapikey')
img = open("path/to/image.jpg", "rb")  # images must be opened in binary mode
api.image.detect(image=img, lc_src="zh")

curl \
  --request POST \
  --url http://waygoapp.com/api/v1/image/detect \
  --header 'cache-control: no-cache' \
  --header 'content-type: multipart/form-data' \
  --header 'Authorization: yourapikey' \
  --form 'image=@/path/to/image.jpg' \
  --form lcSrc=zh \
  --form type=default

The above command returns JSON structured like this:

[
  {
    "value": "漢堡",
    "translation": "Hamburger",
    "romanization": "hàn bǎo",
    "shape": [
      {"x": 10, "y": 40}, 
      {"x": 100, "y": 40}, 
      {"x": 100, "y": 80},
      {"x": 10, "y": 80}
    ],
    "colors": {
      "fg": [0, 4, 6],
      "bg": [240, 245, 255]
    },
    "score": {
      "recognition": 1.0,
      "translation": 1.0
    }
  },
  {
    "value": "雞香堡",
    "translation": "Chicken Burger",
    "romanization": "jī xiāng bǎo",
    "shape": [
      {"x": 10, "y": 140}, 
      {"x": 130, "y": 140}, 
      {"x": 130, "y": 180},
      {"x": 10, "y": 180}
    ],
    "colors": {
      "fg": [0, 4, 6],
      "bg": [240, 245, 255]
    },
    "score": {
      "recognition": 1.0,
      "translation": 1.0
    }
  }
]

This endpoint detects the lines of text in an image, and returns the positions, colors and text of the detected text. It will also return an English translation and an English romanization of the text, if the source language is not English. If the source language is English, the translation and romanization fields will currently mirror the value field.

The available form parameters for this request are:

image (required) is the image to be used for detection, attached as part of a multipart-encoded request.
lcSrc (required) is short for “language code, source”, and represents the language code of the language that should be detected in the image. See the POST Form Parameters section for available language codes.
lcTgt (optional) is short for “language code, target”, and represents the language that should be translated into, if any. This defaults to English. See the POST Form Parameters section for available language codes.
type (optional) is a hint for what kind of text should be expected in this image. Different Waygo language and OCR models are optimized for different use cases, and when applicable, this hint can boost performance in certain situations. The available types right now are default and receipt. Use receipt when it is known that the source image contains a receipt, invoice or similar document. Use default in all other cases.

A list of detected labels is returned. Every detected label in the list contains the following fields:

value is the detected text in the language specified by lcSrc
translation is the translation of the detected text into English
romanization contains a pronounceable version of the detected text. The precise romanization depends on the language. For Chinese, the romanization is pinyin, while for Japanese it is a version of romaji.
shape contains a list of points for the shape that fits around the detected text. In most cases, this will be four coordinates representing the four corners of a rectangle. The coordinates are given in clockwise order.
colors contains two keys, fg and bg, short for foreground and background. Foreground is the color of the detected text, while background is the average color behind the text. Both keys contain a list of three integer values between 0 and 255, representing the RGB color value. For example, "fg": [0, 34, 230] means that the text color has a red channel with value 0, green 34, and blue 230, or expressed in CSS, rgba(0, 34, 230, 1.0).
score contains two keys, recognition and translation. Both are values between 0 and 1, with a score of 0 representing very low certainty in the result, and a score of 1 representing the highest certainty of a good result. These values are approximations of how accurate the results are, and can be used to selectively show or filter certain results.
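
The fields above can be consumed along the following lines. This is a sketch under stated assumptions: the sample label is copied from the example response, while the helper names (bounding_box, css_color, confident_labels) and the 0.5 score threshold are illustrative choices, not part of the Waygo API.

```python
# Sample label copied from the example response above.
label = {
    "value": "漢堡",
    "translation": "Hamburger",
    "romanization": "hàn bǎo",
    "shape": [
        {"x": 10, "y": 40},
        {"x": 100, "y": 40},
        {"x": 100, "y": 80},
        {"x": 10, "y": 80},
    ],
    "colors": {"fg": [0, 4, 6], "bg": [240, 245, 255]},
    "score": {"recognition": 1.0, "translation": 1.0},
}


def bounding_box(shape):
    """Return (left, top, width, height) of the axis-aligned box around shape."""
    xs = [point["x"] for point in shape]
    ys = [point["y"] for point in shape]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


def css_color(rgb):
    """Format an [r, g, b] list as a fully opaque CSS rgba() string."""
    r, g, b = rgb
    return f"rgba({r}, {g}, {b}, 1.0)"


def confident_labels(labels, threshold=0.5):
    """Keep labels whose recognition and translation scores both reach
    the threshold (0.5 is an arbitrary illustrative cut-off)."""
    return [
        item for item in labels
        if item["score"]["recognition"] >= threshold
        and item["score"]["translation"] >= threshold
    ]


print(bounding_box(label["shape"]))      # (10, 40, 90, 40)
print(css_color(label["colors"]["fg"]))  # rgba(0, 4, 6, 1.0)
print(len(confident_labels([label])))    # 1
```

Taking the minimum and maximum of the coordinates rather than assuming four corners keeps the box computation valid if a shape ever contains more than four points.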

HTTP Request

POST http://waygoapp.com/api/v1/image/detect

POST Form Parameters

Parameter	Required	Default	Description
image	Yes	-	The image to be used for detection, attached as part of a multipart-encoded request.
lcSrc	Yes	-	The ISO language code of the source language, or in other words, the language of the text in the image. The accepted language codes are: `en` (English) `zh` (Chinese) `ja` (Japanese) `ko` (Korean) If `zh` is specified, the API will automatically handle both simplified and traditional Chinese text.
lcTgt	No	en	The ISO language code of the target language, or in other words, the language to translate to. Currently, the only valid option is `en` (English), and the parameter is not required.
type	No	default	A hint for what kind of text should be expected in this image. Use `receipt` when it is known that the source image contains a receipt, invoice or similar document, and `default` otherwise.
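
Before sending a request, the non-file form fields in the table can be checked on the client side. The sketch below assumes a hypothetical helper, build_detect_fields, that is not part of the waygo package; the allowed values are taken from the table above, and the image itself would still need to be attached separately as multipart data.

```python
# Allowed values, taken from the parameter table above.
ALLOWED_SOURCE_CODES = {"en", "zh", "ja", "ko"}
ALLOWED_TARGET_CODES = {"en"}
ALLOWED_TYPES = {"default", "receipt"}


def build_detect_fields(lc_src, lc_tgt="en", text_type="default"):
    """Validate the non-file form fields and return them as a dict.

    Hypothetical helper for illustration only.
    """
    if lc_src not in ALLOWED_SOURCE_CODES:
        raise ValueError(f"unsupported lcSrc: {lc_src!r}")
    if lc_tgt not in ALLOWED_TARGET_CODES:
        raise ValueError(f"unsupported lcTgt: {lc_tgt!r}")
    if text_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported type: {text_type!r}")
    return {"lcSrc": lc_src, "lcTgt": lc_tgt, "type": text_type}


print(build_detect_fields("zh", text_type="receipt"))
```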

Errors

The standard error response is structured like this:

{
    "code": 418,
    "message": "error message",
    "fields": ["field1", "field2"]
}

The fields field is set only when the error relates to specific fields in the request; otherwise it may be omitted. The message field gives a human-readable description of the error.

The error code will match the HTTP response code, and possible codes are:

Error Code	Meaning
400	Bad Request
401	Unauthorized
403	Forbidden
404	Not Found
429	Too Many Requests
500	Internal Server Error
503	Service Unavailable
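
A client might parse the standard error response along these lines. This is a sketch, not an official client: the ApiError class and parse_error function are hypothetical names, and treating 429, 500 and 503 as retryable is an assumption about reasonable client behavior rather than something the API documents.

```python
import json
from dataclasses import dataclass, field
from typing import List

# Assumption: these codes usually indicate a temporary condition worth retrying.
RETRYABLE_CODES = {429, 500, 503}


@dataclass
class ApiError:
    code: int
    message: str
    fields: List[str] = field(default_factory=list)

    @property
    def retryable(self) -> bool:
        return self.code in RETRYABLE_CODES


def parse_error(body: str) -> ApiError:
    """Parse a JSON error body; the "fields" key may be absent."""
    data = json.loads(body)
    return ApiError(
        code=int(data["code"]),
        message=data["message"],
        fields=list(data.get("fields", [])),
    )


err = parse_error('{"code": 429, "message": "Too Many Requests"}')
print(err.code, err.fields, err.retryable)
```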

iOS SDK

1. Add Waygo SDK to Project

If you are creating a new iOS app, open Xcode, select File > New > Project, choose Single View Application, and give the project a name. For the purposes of this tutorial, we have named ours WaygoSample. Make sure that your project Language is set to Objective-C. This tutorial is designed for an iPhone, so we have set Devices to iPhone.

Unzip WaygoSDK_iOS.zip.

Create a new group (folder) in your project and name it WaygoSDK. Right click on this folder and select Add Files to “WaygoSample”. In the window that follows, navigate to your WaygoSDK_iOS folder and open the SDK folder. Select all files in this folder (Waygo.framework, opencv2.framework, Waygo.bundle, Waygo.license, public.pem, sign.bin). Select Copy items if needed so that these resources will be copied to your project folder.

Verify that the files have been added correctly by going to the Build Phases tab of your app target settings and expanding the Link Binary With Libraries and Copy Bundle Resources sections.

2. Add Additional Dependencies

The Waygo SDK requires 2 standard libraries to run properly. The libraries are libc++ and libz. Before leaving the previous screen (your app target Build Phases), click on the + at the bottom of the Link Binary With Libraries section.

Select libc++.tbd and libz.tbd from the list and click Add.

3. Configure Project Settings

The Waygo SDK is built for the standard mobile architectures (armv7, armv7s and arm64). Your project must build for at least one of these architectures and cannot be built for any other architecture (Waygo does not run on armv6 or in the simulator). Set your project Architectures accordingly.

Make sure you are using the default Apple compiler, in this case Apple LLVM 7.1.

Next, make sure that your project folder $(SRCROOT) is included in your Framework Search Paths.

The Waygo SDK does not use Bitcode, so set Enable Bitcode to NO in your Project Build Settings.

The Waygo SDK is built with C++. Your C++ compiler settings must match those of the Waygo SDK. Set your C++ Standard Library to libstdc++ (GNU C++ standard library).

Finally, because the Waygo SDK uses C++, you must rename your app’s main.m file to main.mm, which enables your app to compile and run C++ code.

4. Initializing the Camera and Waygo

Conform to Camera Protocol

Go to the header file for the view controller (ViewController.h) from which you want to run Waygo. You need to import <Waygo/Waygo.h> and make your view controller conform to the AVCaptureVideoDataOutputSampleBufferDelegate protocol.

#import <UIKit/UIKit.h>
#import <Waygo/Waygo.h>

@interface ViewController : UIViewController <AVCaptureVideoDataOutputSampleBufferDelegate>

@end

Create Camera and Waygo Properties

Go to the class file for the view controller (ViewController.m) from which you want to run Waygo, and create properties for the camera and the Waygo SDK within the @interface of your class. You will need the following objects:
Waygo *waygoSDK (SDK object)
AVCaptureSession *captureSession (Camera capture session)
AVCaptureVideoPreviewLayer *previewLayer (Camera view display on screen)
CGRect targetBox (region of preview for cropping and processing in Waygo)
UIImageView *targetBoxView (visual boundary of single line target box)

#import "ViewController.h"

@interface ViewController ()

// Waygo SDK object
@property (strong, nonatomic) Waygo *waygoSDK;
// Camera objects
@property (strong, nonatomic) AVCaptureSession *captureSession;
@property (strong, nonatomic) AVCaptureVideoPreviewLayer *previewLayer;
// Target box (the region of screen to fit text into and crop for processing)
@property (nonatomic) CGRect targetBox;
@property (weak, nonatomic) IBOutlet UIImageView *targetBoxView;

@end

Initialize the Waygo SDK

You will now initialize the Waygo SDK and set the language pair you wish to translate; the target box is built later in this section. In the viewDidLoad method of your view controller, enter the code shown below. During the Waygo initialization, your license will be verified. If your license verification fails, the waygoSDK object will be nil, in which case you should not proceed with further setup. See the Waygo.h file for all possible language codes.

- (void)viewDidLoad
{
    [super viewDidLoad];

    // Initialize Waygo SDK object
    self.waygoSDK = [[Waygo alloc] init];
    // self.waygoSDK will be nil if license verification fails
    // check console for verification status
    if (!self.waygoSDK)
        return;
    // Set language pairing code, i.e. CHtoEN for Chinese to English
    [self.waygoSDK setLanguageCode:CHtoEN];
}

Initialize the AVCaptureSession

After building your target box in viewDidLoad, you will now initialize your captureSession. The Waygo SDK will take care of most of the set up. You must then set up your camera previewLayer which displays the live feed that the device camera sees. The code below sets the preview layer to take up the full device screen. Enter the code shown below in viewDidLoad after initializing the Waygo SDK.

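/* Let the Waygo SDK create and configure the capture session.
   A separate [[AVCaptureSession alloc] init] is not needed, because
   the session returned here replaces any previous value. */
self.captureSession = [self.waygoSDK setCaptureSessionForViewController:self];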

/*We display the camera preview layer*/
self.previewLayer = [AVCaptureVideoPreviewLayer
                     layerWithSession:self.captureSession];
self.previewLayer.frame = self.view.frame;
self.previewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill;
[self.view.layer insertSublayer:self.previewLayer atIndex:0];

Implement the AVCaptureSession Delegate

As each new frame of video is captured by your app, the image is sent to the captureOutput:didOutputSampleBuffer:fromConnection: delegate method. This method is part of the AVCaptureVideoDataOutputSampleBufferDelegate protocol that we adopted in our ViewController.h file. Add the method below to your ViewController.m. We skip processing if the camera is currently adjusting focus. Otherwise, we send the imageBuffer and targetBox to the SDK, where the image is cropped using the targetBox frame and translated. The SDK returns an NSArray of WaygoTranslation objects, one for each line of text that was translated.

/* This is the camera delegate. As new frames come in, this method is called
and the image buffer is sent to the Waygo SDK for processing. */
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer 
       fromConnection:(AVCaptureConnection *)connection
{
    // We create an autorelease pool because as we are not in the main_queue
    // our code is not executed in the main thread. So we have to create an
    // autorelease pool for the thread we are in
    @autoreleasepool {
        // get device handle
        AVCaptureDevice *device = [AVCaptureDevice
                                   defaultDeviceWithMediaType:AVMediaTypeVideo];

        // ignore frames that arrive while the device is refocusing
        if ([device isAdjustingFocus])
        {
            return;
        }

        // Get full-size buffer from camera and send to Waygo
        CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
        NSArray *waygoResults = [self.waygoSDK translateImageBuffer:imageBuffer
                                                         fromDevice:device
                                                          withinBox:self.targetBox];

        //We unlock the image buffer
        CVPixelBufferUnlockBaseAddress(imageBuffer,0);
    }
}

Now go back to your viewDidLoad method and start the captureSession. This will allow the delegate method you just created to be called by captureSession.

/* If captureSession is not nil, start the session. The captureSession will be nil
if the app does not have permission to access the camera. In this event, you must
notify the user to enable the app to have camera access. */
if (self.captureSession)
    [self.captureSession startRunning];

We need to show the targetBox on the screen to give the user a visual target for where to center the text. Let’s add the target-box.png image from the WaygoSample project folder to our project.

With the target-box image added to our project, let’s assign the image to our targetBoxView, then set the targetBox rect equal to the targetBoxView frame. Add this to the bottom of your viewDidLoad method.

// Add Target Box Image - this is the region of the camera preview that is
// cropped and sent to the Waygo SDK for processing.
UIImage *targetBoxImage = [[UIImage imageNamed:@"target-box"]
resizableImageWithCapInsets:UIEdgeInsetsMake(10, 10, 10, 10)];
self.targetBoxView.image = targetBoxImage;

Set the _targetBox rect to match the UIImageView frame:

_targetBox.origin.x = _targetBoxView.center.x - _targetBoxView.bounds.size.width/2;
_targetBox.origin.y = _targetBoxView.center.y - _targetBoxView.bounds.size.height/2;
_targetBox.size.width = _targetBoxView.bounds.size.width;
_targetBox.size.height = _targetBoxView.bounds.size.height;
[self.waygoSDK updateCaptureSession:self.captureSession
                    withOrientation:UIInterfaceOrientationPortrait];

The core functionality of the sample project is now complete. You can run the app on an iOS device to check that it builds correctly and that the camera preview appears with the target box visible on the screen.

Display Results on the Screen

The app doesn’t appear to be doing anything interesting right now. We need to add some text labels to the screen to show the user the translations. Let’s go back to the @interface of our ViewController.m and add some UILabel IBOutlets to show the recognized text, the romanized pronunciation, and the translation.

@property (weak, nonatomic) IBOutlet UIView *textDisplayView;
@property (weak, nonatomic) IBOutlet UILabel *textLabel;
@property (weak, nonatomic) IBOutlet UILabel *romanLabel;
@property (weak, nonatomic) IBOutlet UILabel *translationLabel;

Now we need to extract the WaygoTranslation object from the array returned by the waygoSDK and display this information on the screen using the labels we just created. Modify the captureOutput:didOutputSampleBuffer:fromConnection: delegate method to retrieve the WaygoTranslation object from the returned array.

if (waygoResults.count > 0) {
    if (self.waygoSDK.isSingleLine) { // single line, display the new results
        WaygoTranslation *waygoTranslation = waygoResults[0];
        /* We have to push the UI updates to the main thread because
        we are currently in the camera thread */
        dispatch_async(dispatch_get_main_queue(), ^{
            [self displayTranslation:waygoTranslation];
        });
    }
}

Now let’s implement the displayTranslation method we are calling from the camera delegate method. This method will take in the WaygoTranslation object and set the text of our three labels. See the WaygoTranslation.h file for a description of each property.

- (void)displayTranslation:(WaygoTranslation *)results
{
    [self.textLabel setText:results.recognizedText];
    [self.romanLabel setText:results.romanization];
    [self.translationLabel setText:results.translation];
}

Finally, let’s add the Waygo watermark to the bottom right corner of the screen. Add an IBOutlet for UIImageView *waygoWatermark and then copy the watermark image into it using this code at the end of your viewDidLoad method.

UIImageView *watermarkView = [self.waygoSDK getWaygoWatermark];
self.waygoWatermark.image = watermarkView.image;
self.waygoWatermark.alpha = watermarkView.alpha;

That’s it! Your project is now ready to translate single line text in real time using the Waygo SDK. Take a look at the included WaygoSample project for some additional features such as switching to Multi Line mode, pausing a translation, implementing a torch, and handling landscape. Run your project and try translating some phrases!

Android SDK

The Waygo Android SDK documentation for offline mobile OCR is not currently available online, but if you are interested in using our Android SDK, please contact us for the SDK key and documentation.