Web API
Introduction
The Waygo API allows you to extract text from images, also known as OCR, in a way that is automatic and scalable. The API currently supports text extraction of Chinese, English, Japanese and Korean. The detect endpoint returns the position, color and background color of detected text. If you have any questions about the Waygo API or the documentation, please reach out to sdk@waygoapp.com.
Authentication
To authorize, use this code:
import waygo
api = waygo.authorize('yourapikey')
# With shell, you can just pass the correct header with each request
curl "api_endpoint_here" \
-H "Authorization: yourapikey"
Make sure to replace yourapikey with your API key.
Waygo uses API keys to allow access to the API. You can request a new Waygo API key by contacting us.
Waygo expects the API key to be included in every API request to the server, in a header that looks like the following:
Authorization: yourapikey
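As a minimal sketch of the header described above, the following uses only the Python standard library rather than the waygo package. The helper name build_authorized_request is hypothetical, and the URL is the detect endpoint listed later on this page. Nothing is sent over the network; the request object is only constructed.

```python
import urllib.request

# Detect endpoint as listed under "HTTP Request" on this page
DETECT_URL = "http://waygoapp.com/api/v1/image/detect"


def build_authorized_request(url, api_key, data=None):
    """Return a POST request carrying the API key in the Authorization header.

    Hypothetical helper: it only builds the request. Pass the result to
    urllib.request.urlopen to actually perform the call.
    """
    request = urllib.request.Request(url, data=data, method="POST")
    request.add_header("Authorization", api_key)
    return request


request = build_authorized_request(DETECT_URL, "yourapikey")
print(request.get_header("Authorization"))
```

Keeping the header in one helper means the key is attached the same way on every request, which is what the API expects.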
Image
Detect text
import waygo
api = waygo.authorize('yourapikey')
img = open("path/to/image.jpg", "rb")  # binary mode, since image data is not text
api.image.detect(image=img, lc_src="zh")
curl \
--request POST \
--url http://waygoapp.com/api/v1/image/detect \
--header 'cache-control: no-cache' \
--header 'content-type: multipart/form-data' \
--header 'Authorization: yourapikey' \
--form 'image=@/path/to/image.jpg' \
--form lcSrc=zh \
--form type=default
The above command returns JSON structured like this:
[
{
"value": "漢堡",
"translation": "Hamburger",
"romanization": "hàn bǎo",
"shape": [
{"x": 10, "y": 40},
{"x": 100, "y": 40},
{"x": 100, "y": 80},
{"x": 10, "y": 80}
],
"colors": {
"fg": [0, 4, 6],
"bg": [240, 245, 255]
},
"score": {
"recognition": 1.0,
"translation": 1.0
}
},
{
"value": "雞香堡",
"translation": "Chicken Burger",
"romanization": "jī xiāng bǎo",
"shape": [
{"x": 10, "y": 140},
{"x": 130, "y": 140},
{"x": 130, "y": 180},
{"x": 10, "y": 180}
],
"colors": {
"fg": [0, 4, 6],
"bg": [240, 245, 255]
},
"score": {
"recognition": 1.0,
"translation": 1.0
}
}
]
This endpoint detects the lines of text in an image, and returns the positions, colors and text of the detected text. It will also return an English translation and an English romanization of the text, if the source language is not English. If the source language is English, the translation and romanization fields will currently mirror the value field.
The available form parameters for this request are:
image (required) is the image to be used for detection, attached as part of a multipart-encoded request.
lcSrc (required) is short for “language code, source”, and represents the language code of the language that should be detected in the image. See the POST Form Parameters section for available language codes.
lcTgt (optional) is short for “language code, target”, and represents the language that should be translated into, if any. This defaults to English. See the POST Form Parameters section for available language codes.
type (optional) is a hint for what kind of text should be expected in this image. Different Waygo language and OCR models are optimized for different use cases, and when applicable, this hint can boost performance in certain situations. The available types right now are default and receipt. Use receipt when it is known that the source image contains a receipt, invoice or similar document. Use default in all other cases.
A list of detected labels is returned. Every detected label in the list contains the following fields:
value is the detected text in the language specified by lcSrc.
translation is the translation of the detected text into English.
romanization contains a pronounceable version of the detected text. The precise romanization depends on the language. For Chinese, the romanization is pinyin, while for Japanese it is a version of romaji.
shape contains a list of points for the shape that fits around the detected text. In most cases, this will be four coordinates representing the four corners of a rectangle. The coordinates are given in clockwise order.
colors contains two keys, fg and bg, short for foreground and background. Foreground is the color of the detected text, while background is the average color behind the text. Both keys contain a list with three integer values between 0 and 255, representing the RGB color value. For example, "fg": [0, 34, 230] means that the text color has a red channel with value 0, green 34, and blue 230, or expressed in CSS, rgba(0, 34, 230, 1.0).
score contains two keys, recognition and translation. Both are values between 0 and 1, with a score of 0 representing very low certainty in the result, and a score of 1 representing the highest certainty of a good result. These values are approximations of how accurate the results are, and can be used to selectively show or filter certain results.
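The fields above can be post-processed in a few lines. The sketch below is illustrative only: the helper names and the 0.8 threshold are assumptions, not part of the API. The first sample label is copied from the response shown earlier, and the second reuses the other sample label with its scores lowered purely for illustration.

```python
def bounding_box(shape):
    """Return (left, top, right, bottom) enclosing all points of a shape list."""
    xs = [point["x"] for point in shape]
    ys = [point["y"] for point in shape]
    return min(xs), min(ys), max(xs), max(ys)


def css_color(rgb):
    """Format an [r, g, b] list as a fully opaque CSS rgba() string."""
    r, g, b = rgb
    return f"rgba({r}, {g}, {b}, 1.0)"


def confident_labels(labels, min_score=0.8):
    """Keep labels whose recognition and translation scores both reach min_score.

    The default threshold of 0.8 is an arbitrary assumption for this sketch.
    """
    return [
        label for label in labels
        if label["score"]["recognition"] >= min_score
        and label["score"]["translation"] >= min_score
    ]


sample = [
    {   # copied from the sample response on this page
        "value": "漢堡",
        "translation": "Hamburger",
        "shape": [{"x": 10, "y": 40}, {"x": 100, "y": 40},
                  {"x": 100, "y": 80}, {"x": 10, "y": 80}],
        "colors": {"fg": [0, 4, 6], "bg": [240, 245, 255]},
        "score": {"recognition": 1.0, "translation": 1.0},
    },
    {   # hypothetical low-confidence label (scores lowered for illustration)
        "value": "雞香堡",
        "translation": "Chicken Burger",
        "shape": [{"x": 10, "y": 140}, {"x": 130, "y": 140},
                  {"x": 130, "y": 180}, {"x": 10, "y": 180}],
        "colors": {"fg": [0, 4, 6], "bg": [240, 245, 255]},
        "score": {"recognition": 0.4, "translation": 0.3},
    },
]

for label in confident_labels(sample):
    print(label["value"], bounding_box(label["shape"]), css_color(label["colors"]["fg"]))
```

Taking the minimum and maximum of the coordinates, rather than assuming a fixed corner order, keeps the bounding box correct even when shape is not an axis-aligned rectangle.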
HTTP Request
POST http://waygoapp.com/api/v1/image/detect
POST Form Parameters
Parameter | Required | Default | Description |
---|---|---|---|
image | Yes | - | The image to be used for detection, attached as part of a multipart-encoded request. |
lcSrc | Yes | - | The ISO language code of the source language, or in other words, the language of the text in the image. The supported languages are Chinese (zh), English, Japanese and Korean. If zh is specified, the API will automatically handle both simplified and traditional Chinese text. |
lcTgt | No | en | The ISO language code of the target language, or in other words, the language to translate to. Currently, the only valid option is en (English), and the parameter is not required. |
type | No | default | A hint for what kind of text should be expected in this image. Use receipt when it is known that the source image contains a receipt, invoice or similar document, and default otherwise. |
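One way to apply the defaults and constraints from the table before sending a request is sketched below. The function name build_detect_fields and its error messages are hypothetical. The image itself is not handled here, since it is attached separately as the multipart file part.

```python
# Values taken from the parameter table on this page
ALLOWED_TYPES = {"default", "receipt"}
ALLOWED_TARGETS = {"en"}  # the page states en is currently the only valid target


def build_detect_fields(lc_src, lc_tgt="en", text_type="default"):
    """Return the non-file form fields for a detect request.

    Hypothetical helper: it validates values locally and raises ValueError
    before any request is made.
    """
    if not lc_src:
        raise ValueError("lcSrc is required")
    if lc_tgt not in ALLOWED_TARGETS:
        raise ValueError(f"unsupported lcTgt: {lc_tgt!r}")
    if text_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported type: {text_type!r}")
    return {"lcSrc": lc_src, "lcTgt": lc_tgt, "type": text_type}


print(build_detect_fields("zh"))
print(build_detect_fields("zh", text_type="receipt"))
```

Checking the values on the client avoids a round trip for mistakes the server would reject anyway.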
Errors
The standard error response is structured like this:
{
"code": 418,
"message": "error message",
"fields": ["field1", "field2"]
}
The fields field is only set if the error relates to specific fields in the request; otherwise it may be left out. The message field gives a human-readable description of the error.
The error code will match the HTTP response code, and possible codes are:
Error Code | Meaning |
---|---|
400 | Bad Request |
401 | Unauthorized |
403 | Forbidden |
404 | Not Found |
429 | Too Many Requests |
500 | Internal Server Error |
503 | Service Unavailable |
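A client might turn the error body into a readable message along the following lines. This is a sketch, not part of the API: the helper names, the sample message text, and the choice of which codes are worth retrying (429, 500 and 503) are all assumptions.

```python
import json

# Codes and meanings from the table above
ERROR_MEANINGS = {
    400: "Bad Request",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not Found",
    429: "Too Many Requests",
    500: "Internal Server Error",
    503: "Service Unavailable",
}

# Assumption: rate limiting and server-side failures may succeed on a later attempt
RETRYABLE_CODES = {429, 500, 503}


def describe_error(body):
    """Return a one-line summary of a JSON error body."""
    error = json.loads(body)
    code = error["code"]
    summary = f"{code} {ERROR_MEANINGS.get(code, 'Unknown')}: {error['message']}"
    fields = error.get("fields", [])  # fields may be left out of the response
    if fields:
        summary += f" (fields: {', '.join(fields)})"
    return summary


def is_retryable(body):
    """Return True if the error code is one this sketch treats as transient."""
    return json.loads(body)["code"] in RETRYABLE_CODES


# Hypothetical error body following the structure shown above
sample_body = '{"code": 400, "message": "missing parameter", "fields": ["lcSrc"]}'
print(describe_error(sample_body))
print(is_retryable(sample_body))
```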
iOS SDK
1. Add Waygo SDK to Project
If you are creating a new iOS app, open Xcode and select File > New > Project and select a Single View Application and name it whatever you want. For the purposes of this tutorial, we have named ours WaygoSample. Make sure that your project Language is set to Objective-C. This tutorial is designed for an iPhone so we have set our Devices to iPhone.
Unzip WaygoSDK_iOS.zip.
Create a new group (folder) in your project and name it WaygoSDK. Right click on this folder and select Add Files to “WaygoSample”. In the window that follows, navigate to your WaygoSDK_iOS folder and open the SDK folder. Select all files in this folder (Waygo.framework, opencv2.framework, Waygo.bundle, Waygo.license, public.pem, sign.bin). Select Copy items if needed so that these resources will be copied to your project folder.
Verify that the files have been added correctly by going to the Build Phases tab of your app target settings and expanding the Link Binary With Libraries and Copy Bundle Resources sections.
2. Add Additional Dependencies
The Waygo SDK requires 2 standard libraries to run properly. The libraries are libc++ and libz. Before leaving the previous screen (your app target Build Phases), click on the + at the bottom of the Link Binary With Libraries section.
Select libc++.tbd and libz.tbd from the list and click Add.
3. Configure Project Settings
The Waygo SDK is built for the standard mobile architectures (armv7, armv7s and arm64). Your project must build for at least one of these architectures and cannot be built for anything other than these three (Waygo does not run on armv6, or the simulator). Set your project Architectures accordingly.
Make sure you are using the default Apple compiler, in this case Apple LLVM 7.1.
Next, make sure that your project folder $(SRCROOT) is included in your Framework Search Paths.
The Waygo SDK does not make use of Bitcode so set it to NO in your Project Build Settings.
The Waygo SDK is built with C++. Your C++ compiler settings must match those of the Waygo SDK. Set your C++ Standard Library to libstdc++ (GNU C++ standard library).
Finally, because the Waygo SDK uses C++, you must modify your app’s main.m file extension to main.mm which will enable your app to run C++ code.
4. Initializing the Camera and Waygo
Conform to Camera Protocol
Go to the header file (ViewController.h) of the view controller from which you want to run Waygo. Import <Waygo/Waygo.h> and make your view controller conform to the AVCaptureVideoDataOutputSampleBufferDelegate protocol.
#import <UIKit/UIKit.h>
#import <Waygo/Waygo.h>
@interface ViewController : UIViewController <AVCaptureVideoDataOutputSampleBufferDelegate>
@end
Create Camera and Waygo Properties
Go to the class file (ViewController.m) of the view controller from which you want to run Waygo and create some properties for the camera and Waygo SDK within the @interface of your class. You will need the following objects:
Waygo *waygoSDK (SDK object)
AVCaptureSession *captureSession (camera capture session)
AVCaptureVideoPreviewLayer *previewLayer (camera view display on screen)
CGRect targetBox (region of preview for cropping and processing in Waygo)
UIImageView *targetBoxView (visual boundary of single line target box)
#import "ViewController.h"
@interface ViewController ()
// Waygo SDK object
@property (strong, nonatomic) Waygo *waygoSDK;
// Camera objects
@property (strong, nonatomic) AVCaptureSession *captureSession;
@property (strong, nonatomic) AVCaptureVideoPreviewLayer *previewLayer;
// Target box (the region of screen to fit text into and crop for processing)
@property (nonatomic) CGRect targetBox;
@property (weak, nonatomic) IBOutlet UIImageView *targetBoxView;
@end
Initialize the Waygo SDK
You will now initialize the Waygo SDK, set the language pair you wish to translate, and build the target box. In the viewDidLoad method of your view controller, enter the code shown below. During the Waygo initialization, your license will be verified. If license verification fails, the waygoSDK object will be nil, in which case you should not proceed with further setup. See the Waygo.h file for all possible language codes.
- (void)viewDidLoad
{
    [super viewDidLoad];
    // Initialize Waygo SDK object
    self.waygoSDK = [[Waygo alloc] init];
    // self.waygoSDK will be nil if license verification fails
    // check console for verification status
    if (!self.waygoSDK)
        return;
    // Set language pairing code, i.e. CHtoEN for Chinese to English
    [self.waygoSDK setLanguageCode:CHtoEN];
}
Initialize the AVCaptureSession
After building your target box in viewDidLoad, you will now initialize your captureSession. The Waygo SDK will take care of most of the setup. You must then set up your camera previewLayer, which displays the live feed that the device camera sees. The code below sets the preview layer to take up the full device screen. Enter the code shown below in viewDidLoad after initializing the Waygo SDK.
/* The Waygo SDK creates and configures the capture session for us,
so no separate [[AVCaptureSession alloc] init] is needed */
self.captureSession = [self.waygoSDK setCaptureSessionForViewController:self];
/*We display the camera preview layer*/
self.previewLayer = [AVCaptureVideoPreviewLayer
layerWithSession:self.captureSession];
self.previewLayer.frame = self.view.frame;
self.previewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill;
[self.view.layer insertSublayer:self.previewLayer atIndex:0];
Implement the AVCaptureSession Delegate
As each new frame of video is captured by your app, the image will be sent to the captureOutput:didOutputSampleBuffer:fromConnection: delegate method. This method is part of the AVCaptureVideoDataOutputSampleBufferDelegate protocol that we set up in our ViewController.h file. Add the method below to your ViewController.m. We skip processing if the camera is currently adjusting focus. We send the imageBuffer and targetBox to the SDK, where the image will be cropped using the targetBox frame and translated. The SDK will return an NSArray of WaygoTranslation objects, one for each line of text that was translated.
/* This is the camera delegate. As new frames come in, this method is called
   and the image buffer is sent to the Waygo SDK for processing. */
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    // This method is not called on the main queue, so our code does not
    // run on the main thread. We therefore create an autorelease pool
    // for the thread we are in
    @autoreleasepool {
        // get device handle
        AVCaptureDevice *device = [AVCaptureDevice
            defaultDeviceWithMediaType:AVMediaTypeVideo];
        // ignore frames that arrive while the device is refocusing
        if ([device isAdjustingFocus])
        {
            return;
        }
        // Get full-size buffer from camera and send to Waygo
        CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
        NSArray *waygoResults = [self.waygoSDK translateImageBuffer:imageBuffer
                                                         fromDevice:device
                                                          withinBox:self.targetBox];
        // We unlock the image buffer
        CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
    }
}
Now go back to your viewDidLoad method and start the captureSession. This will allow the delegate method you just created to be called by captureSession.
/* If captureSession is not nil, start the session. The captureSession will be nil
if the app does not have permission to access the camera. In this event, you must
notify the user to enable the app to have camera access. */
if (self.captureSession)
[self.captureSession startRunning];
We need to show the targetBox on the screen to give the user a visual target for where to center the text. Let’s add the target-box.png image from the WaygoSample project folder to our project.
With the target-box image added to our project, let’s assign the image to our targetBoxView, then set the targetBox rect equal to the targetBoxView frame. Add this to the bottom of your viewDidLoad method.
// Add Target Box Image - this is the region of the camera preview that is
// cropped and sent to the Waygo SDK for processing.
UIImage *targetBoxImage = [[UIImage imageNamed:@"target-box"]
resizableImageWithCapInsets:UIEdgeInsetsMake(10, 10, 10, 10)];
self.targetBoxView.image = targetBoxImage;
Set the _targetBox rect to match the UIImageView frame:
_targetBox.origin.x = _targetBoxView.center.x - _targetBoxView.bounds.size.width/2;
_targetBox.origin.y = _targetBoxView.center.y - _targetBoxView.bounds.size.height/2;
_targetBox.size.width = _targetBoxView.bounds.size.width;
_targetBox.size.height = _targetBoxView.bounds.size.height;
[self.waygoSDK updateCaptureSession:self.captureSession
withOrientation:UIInterfaceOrientationPortrait];
The core functionality of the sample project is now complete. You can run the app on an iOS device to check that it builds correctly and you see the camera preview with target box visible on the screen.
Display Results on the Screen
The app doesn’t appear to be doing anything interesting right now. We need to add some text labels to the screen to show the user the translations. Let’s go back to the @interface of our ViewController.m and add some UILabel IBOutlets to show the recognized text, the romanized pronunciation, and the translation.
@property (weak, nonatomic) IBOutlet UIView *textDisplayView;
@property (weak, nonatomic) IBOutlet UILabel *textLabel;
@property (weak, nonatomic) IBOutlet UILabel *romanLabel;
@property (weak, nonatomic) IBOutlet UILabel *translationLabel;
Now we need to extract the WaygoTranslation object from the array returned by the waygoSDK and display this information on the screen using the labels we just created. Modify the captureOutput:didOutputSampleBuffer:fromConnection: delegate method to retrieve the WaygoTranslation object from the returned array.
if (waygoResults.count > 0) {
    if (self.waygoSDK.isSingleLine) { // single line, display the new results
        WaygoTranslation *waygoTranslation = waygoResults[0];
        /* We have to push the UI updates to the main thread because
           we are currently in the camera thread */
        dispatch_async(dispatch_get_main_queue(), ^{
            [self displayTranslation:waygoTranslation];
        });
    }
}
Now let’s implement the displayTranslation method we are calling from the camera delegate method. This method will take in the WaygoTranslation object and set the text of our three labels. See the WaygoTranslation.h file for a description of each property.
- (void)displayTranslation:(WaygoTranslation *)results
{
    [self.textLabel setText:results.recognizedText];
    [self.romanLabel setText:results.romanization];
    [self.translationLabel setText:results.translation];
}
Finally, let’s add the Waygo watermark to the bottom right corner of the screen. Add an IBOutlet for UIImageView *waygoWatermark and then attach the image using this code at the end of your viewDidLoad method.
UIImageView *watermarkView = [self.waygoSDK getWaygoWatermark];
self.waygoWatermark.image = watermarkView.image;
self.waygoWatermark.alpha = watermarkView.alpha;
That’s it! Your project is now ready to translate single line text in real time using the Waygo SDK. Take a look at the included WaygoSample project for some additional features such as switching to Multi Line mode, pausing a translation, implementing a torch, and handling landscape. Run your project and try translating some phrases!
Android SDK
The Waygo Android SDK documentation for offline mobile OCR is not currently available online, but if you are interested in using our Android SDK, please contact us for the SDK key and documentation.