Ads Evaluation Rater Hub

July 25, 2017 | Author: Rafael Sánchez-Vallejo Bey | Category: Advertising, Information Retrieval, World Wide Web, Technology, Cyberspace


Welcome to the Rater Hub, a resource for how to use the EWOQ Ads Evaluation System. Click a link in the sidebar to the left to go to a page, or click a link in the Table of Contents below to go directly to a section within the Rater Hub.

Table of Contents

• Introduction
• Using the EWOQ System
  o Using EWOQ Overview
  o Evaluating a Task (Search Ad Task)
  o Search Ad Instructions
  o Reporting a Problem
• Advanced Topics
  o Task Switching
  o Mobile Ad Rating
  o Ad Formats
  o Other Task Types
  o Advanced Problem Solving
• General Information
  o Work Requirements
  o System Information
  o Glossary
  o FAQ
  o Contact Ads Eval


Introduction Welcome! You are one of many people across the country and across the world who participate in the ads evaluation program. The following introduction provides some background information that applies to all program participants.
• Program Purpose
• What You’re Doing and What You’re Not Doing
• The Importance of Your Work

Program Purpose Google engineers and researchers are always working to improve Google advertising products for their users: more useful, more relevant, more appealing, and more trustworthy. Our most important tool is information. Knowing what we’re doing well and what we’re doing poorly lets us solve problems and make good things even better. We have many sources of information, but one of our most important sources is people like you, who can review an ad or a web page and provide a human perspective. You will encounter many different kinds of tasks in the course of your work. There are a few main types that you will encounter constantly; you will soon become very familiar with them. These tasks involve looking at advertisements and judging their quality. Other kinds of tasks may appear only once in a while, or show up once or twice and then never again. While most tasks have a very obvious connection to advertising, sometimes the connection might not be as obvious--for example, you might be asked to judge whether two terms mean the same thing, find something on a map, or identify different parts of a sentence. Often this data is used to make specific kinds of improvements or to test different parts of the ads program. We won’t always be able to tell you what the task is for, but it’s always something we consider important. Most of the time, we ask you to put yourself in the place of a person using a Google product and looking at an ad. Would somebody looking for a particular kind of thing also be interested in this other thing? If you were looking for information on this subject, would that be a good page to visit? Is this the product someone was looking for? It can be hard to put yourself in the place of a Google user--especially when it’s a subject matter you don’t know anything about. We rely on your common sense, your research abilities, and your knowledge and experience to help you make these decisions. 
Other times, we ask you about your own preferences and opinions--is this video interesting? Is this picture ugly? Is this distance too far to walk for an ice cream cone? And other times, we ask you to make objective judgments that should be the same for everyone--is this one word or two? What city is being talked about on this page? What language is this?


What You’re Doing and What You’re Not Doing Google shows its users an incredible number of ads each day. The number of ads examined by the Ads Evaluation program is a very, very tiny fraction of the total number. As a human evaluator, you are not approving or disapproving the ads you evaluate--you are not deciding directly which ads get shown and which ads don’t get shown. In fact, even though you are providing scores and feedback about individual advertisements and advertisers, your evaluations don’t generally have any direct effect on those advertisements or advertisers. Sometimes the ads you rate are from a long time ago. Sometimes they are in a different form than they originally appeared. Sometimes they were never shown by Google at all, in any form. Your judgments don’t have a direct impact on the actual ads you rate, but they have a very large indirect impact on all ads at Google. Your judgments are used in many ways, but two are especially important.

• Judging how we’re doing. We use your judgments to determine whether our ads are getting better or worse over time, to see whether particular kinds of ads are working well or not, to see what kinds of ads cause problems, and to see what kind of impact a particular change to our systems will have. When Google engineers design new systems or propose changes to existing systems, the human evaluation data you provide is used to judge whether those proposals succeed or fail.

• Training our systems. Your judgments are fed to machines that use this information to learn how to make automatic judgments about quality. Every ad you rate is one more piece of information that our systems use to improve the quality of our ads. The evaluations you provide are also used by engineers in the process of designing those systems and coming up with new ideas on how to improve Google’s advertising products.

The Importance of Your Work Your work is important! You won’t often hear about the results of your work. In fact, sometimes it might seem like your work just flows into a black hole--you may sometimes see similar ads over and over and repeatedly identify the same problems. Even though you don’t always see the impact, your work is very important, and many people at Google review it very, very closely. We thank you for your contributions.


Using the EWOQ Ad Rating System The EWOQ Ads Rating System is a web browser-based system for evaluating different types of tasks. It is through the EWOQ system that you perform your work. This section introduces you to the various parts of the EWOQ system. Using EWOQ Overview reviews the links you find at the top of every page in the EWOQ system. Evaluating a Task walks you through the evaluation of a single task. The Search Ad Rating Instructions link takes you to the specific instructions used to evaluate standard Search Ad Rating tasks, the most common task performed by raters. Finally, the Reporting A Problem link provides guidance on how to handle tasks you might need to report to the admin team.
• Using EWOQ Overview
• Evaluating a Task
• Search Ad Rating Instructions
• Reporting a Problem

Using EWOQ Overview In EWOQ, the basic unit of work is the task: you make a request to acquire a task, the system assigns you a task and gives you instructions. Once you acquire a task, you review the instructions, evaluate the task, answer questions, submit your work, and move on to the next item. You acquire tasks on the Rating Home page:

All work starts here. When you click the button to acquire your next task, the EWOQ system determines which type of task to assign you depending on availability and your level of experience. New raters start with Search Ads Tasks, and will acquire other task types as they gain experience.

The Black Navigation Bar When you load the EWOQ Ad Rating System, your initial location is the Rating Home page. At the top of the window is a black navigation bar, which is also found on every page in the system. On the left are five links: Rating Home, Rating History, Rater Hub, Manage Account, and Old Rating System. On the right is a box that contains a number, followed by your email address and a Sign Out link. Each of the links is explained below:

Rating Home


Rating Home contains an "Acquire Next Task" button, which you use to acquire a new task to evaluate.

Rating History If you're interested in reviewing the number of new tasks you've submitted over a period of time, click the Rating History link. It takes you to the My History page where you can review your new task submission history. The "Today" button lists the new tasks you have rated today. "Last Week" shows the new tasks you have submitted over the previous seven days. "Last Month" shows the new tasks you have submitted in the last month. You can also select date ranges and the "Search" button to view the new tasks you have submitted over a date range you choose. When the tasks are displayed, you can click on the "Show Details" button to see the exact number of new tasks submitted on a given day.

Rater Hub The Rater Hub provides general information about the Ads Rating Program, using the EWOQ Ads Rating System, work and system requirements, glossary, and frequently asked questions.

Manage Account This link contains information about your account specific to your work as an ads quality rater. Information entered here is not linked to any other Google services.

Old Rating System If you started your term as an ads quality rater in Old EWOQ, this link takes you to that interface. If you have only worked in the current Ad Rating System (New EWOQ), there is no reason to go to Old EWOQ unless we specifically indicate there is work in the old system for you. Only raters who have been asked by the Ad Rating Administrative Team to devote some of their time to tasks in Old EWOQ need to click on this link.

Your Email Address The email address you use to access the EWOQ Ad Rating System is displayed here.

Sign Out Clicking this link signs you out of the EWOQ Ad Rating System.


Rating A Search Ad Rating Task As an Ads Quality Rater (AQR), the basic, most common task you acquire is a Search Ad Rating Task. In this section you will learn:
• how to acquire a task,
• how to recognize a Search Ad Rating Task,
• each component of the user interface of a Search Ad Rating Task,
• how to answer each question,
• how to submit your rating.

Step 1: Acquire a Task To acquire a task, click the Acquire Next Task button at the top of the Rating Home.

(Your Gmail address appears in the black navigation bar at the top of the page.)

Step 2: Locate the Task Type and Read the Instructions Once you acquire a Search Ad Rating Task, the first page you see looks like this:

Notice the red header at the top of the page. For the next 30 minutes or so, you will be assigned a task type that involves Search Ads Evaluation. These are the instructions for all Search Ad Rating Tasks. Please read these carefully. Once you have reviewed them and feel comfortable with the guidelines within, click the Continue to Task button in the bottom or top left-hand corner of the screen.

Step 3: Understand the User Query and Rate the Ad Creative In this step, you will:
(1) Look at the query and figure out the user intent. If you are not sure of the user intent or do not understand the query, you must research the query.
(2) Look at the approximate query location, if it is available.
(3) Study the ad creative displayed.
(4) Select any flags, if applicable, and rate the ad creative based on the criteria in the task instructions.
(5) Make a comment if necessary.
(6) Proceed to the next page (Landing Page Rating).
When you continue to the task, you see this:


The red numbers in the screenshot above represent the following:

1. Search Ads Evaluation (Rating Language : Rating Country) - This indicates the type of task you are currently working on (a Search Ad Rating Task). In this case, you are evaluating search ads. Next to the task type, you see the rating language and country. In the screenshot above, the rating language is English and the rating country is the United States. If you were hired to work in a different language, tasks in your target language have that language and rating country displayed here. (Remember: Always rate from the perspective of a user in the rating country, and pay attention to which language you are currently rating in if you work in multiple languages!)

2. Task Instructions - Click the triangular toggle to open the task instructions for this task within the page if you want to refer to them when rating. If the instructions are long, you must scroll down to the bottom of the page to advance to the task rating page. Close the instructions by clicking on the toggle.

3. Query - This is the query (set of words, numbers, symbols, or all three that a user types in the search box of a search engine) you should use to understand the user intent. If you do not fully understand the user intent, research it by clicking on the query itself. This opens up a separate browser window or tab and shows Google search results for the rating country and language for this task. Your understanding of the query is required; however, you only need to research the query if you don't understand the user intent.

4. Query Flag - There are four labeled checkboxes that each have a specific meaning defined in the task instructions:
- Foreign Language Query
- Nonsense Query
- Porn Query
- Russian Transcription Error
Click on a box if you determine that the criteria are met for that flag. You can choose as many flags as you feel apply to the query, but you should only use flags when appropriate. Not every task requires a flag. Please review the task instructions for when to use flags. If you select one of these flags, the rest of the task becomes disabled and the Submit button becomes enabled. You do not have to provide any further ratings for this task, but may proceed directly to submitting it from the AC page (see number 11).

5. Approximate Query Location - This is a map indicating the geographical region where the query was entered. If you are unfamiliar with the location in the map, you can zoom in and out of the map to familiarize yourself with it. If this information is unavailable, either the labeled section will be absent or the words “Location unknown” will be displayed in place of the map, depending on the task.

6. Ad Creative - This is the advertisement that you evaluate in relation to the query and according to the criteria described in the task instructions. Imagine that this ad appears on a Google search results page after the user enters their query. Ignore any ads you see on the search results page; you are only evaluating the one ad we show you on the Search Ad Rating Task page.


NOTE: Do not click on any of the links in the ad creative! You should only evaluate what you see in the Ad Creative box in EWOQ.

7. Ad Creative Flags - There are five labeled checkboxes that each have a specific meaning defined in the task instructions:
- Navigational Bullseye
- Foreign Language
- Unexpected Porn
- Used Prior Knowledge in Judging Advertiser
- Secondary Interpretation of Query
Click on a box if you determine that the criteria are met for that flag. You can choose as many flags as you feel apply to the ad creative, but you should only use flags when appropriate. Not every task requires a flag. Please review the task instructions for when to use flags. If you select the Foreign Language or Unexpected Porn flag, the Ad Creative Rating becomes disabled. You do not have to answer this question.

8. Ad Creative Rating slider - Here you must answer the question: How promising is this ad? This slider is divided into four answer ranges (Very Unpromising, Somewhat Unpromising, Somewhat Promising, Very Promising) on a number scale from -100 to 100. The slider allows you to enter a rating at any point by moving the marker along the scale. Move the marker to the appropriate rating based on your evaluation of the ad creative. The little red flag will disappear when you select a rating. Remember to see the task instructions for criteria for each range, and see the FAQ article on the Rater Hub for more information about rating with sliders (“How do I rate using a slider?” and “What if I never seem to use a certain part of the slider?”).

9. Comment box - Enter your comments into this box.

10. Proceed to Landing Page Rating button - Once you have answered all questions on this page, this button becomes enabled and you may proceed to the next page by clicking on it.


11. Submit button - On this page, the Submit button is disabled. If you select one of the Query Flags, the rest of the task becomes disabled and the Submit button becomes enabled, allowing you to submit the task directly from the AC page. Otherwise, this button will remain disabled until you have completed both sections of the task.

Button is disabled.

12. Report a Problem link - If you encounter a technical problem with this rating task, please use the “Report a Problem” link in the lower right-hand corner of the rating page. This will open a Help panel. See the section Reporting a Problem for information on when to report problems, and for more details about using the Help panel. Note: By submitting a problem report, you will skip the task and lose all work you have done on it.


13. Skip this Task link - If for some reason you need to skip this rating task, you may use the “Skip this Task” link in the lower right-hand corner of the rating page. Skipping the task will take you directly to your next task. Note: By skipping a task, you will lose all work you have done on it.
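The enabling and disabling rules described in items 4, 7, 10, and 11 above can be summarized in a small sketch. This is only an illustrative model, not EWOQ's actual code: the class and method names are hypothetical, and two behaviors are assumptions read into the text (that the Proceed button counts among the parts disabled by a Query Flag, and that a disabled rating counts as answered).

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Flag names copied from the interface description above.
QUERY_FLAGS = {
    "Foreign Language Query",
    "Nonsense Query",
    "Porn Query",
    "Russian Transcription Error",
}
AC_FLAGS_DISABLING_RATING = {"Foreign Language", "Unexpected Porn"}


@dataclass
class AdCreativePage:
    """Hypothetical model of the Ad Creative (AC) page state."""

    query_flags: Set[str] = field(default_factory=set)
    ac_flags: Set[str] = field(default_factory=set)
    ac_rating: Optional[int] = None  # slider position, or None if not yet set

    def rest_of_task_disabled(self) -> bool:
        # Selecting any Query Flag disables the rest of the task.
        return bool(self.query_flags & QUERY_FLAGS)

    def rating_disabled(self) -> bool:
        # The rating slider is also disabled by two of the Ad Creative Flags.
        return self.rest_of_task_disabled() or bool(
            self.ac_flags & AC_FLAGS_DISABLING_RATING
        )

    def submit_enabled(self) -> bool:
        # On the AC page, Submit is enabled only when a Query Flag is selected.
        return self.rest_of_task_disabled()

    def proceed_enabled(self) -> bool:
        # ASSUMPTION: Proceed is among the parts a Query Flag disables, and a
        # disabled rating question counts as answered.
        if self.rest_of_task_disabled():
            return False
        return self.rating_disabled() or self.ac_rating is not None
```

For example, `AdCreativePage(query_flags={"Nonsense Query"}).submit_enabled()` is true, while a page with no flags and no rating has neither button enabled.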

Step 4: Visit the Landing Page and Rate It After clicking on the Proceed to Landing Page Rating button, you arrive at a page that looks like the one below. Here you will do the following:
(1) Look at the query once again.
(2) Visit the landing page for the previous ad creative.
(3) Look at the approximate query location again, if available.
(4) Select applicable flags and rate the landing page based on the criteria in the task instructions. Note: Your landing page rating should be independent of the score you gave for the ad creative. You should judge the landing page in relation to the query (user intent).
(5) Make a comment if necessary.
(6) Submit your ratings for the whole task!

You have the option to view the task instructions at any time by clicking on the toggle to open them.


The red numbers in the screenshot represent the following:

14. Query - Once again, the query is displayed here.

15. Approximate Query Location - Once again, a map indicating the geographical region where the query was entered is displayed here, if that information is available.

16. Visit Landing Page button - Click on this button to open the landing page for the ad creative you just rated. (Note: Clicking on the ad creative on the first page does not take you to the correct landing page. The only landing page/site that you should rate is the one accessible via this button.)

17. Landing Page Flags - Similar to the Ad Creative Flags, for the landing page, there are six labeled checkboxes that each have a specific meaning defined in the task instructions:
- Navigational Bullseye
- Foreign Language
- Unexpected Porn
- Unexpected Download
- Error / Did Not Load
- Secondary Interpretation of Query
Click on a box if you determine that the criteria are met for that flag. As before, you can choose as many flags as you feel apply to the landing page, but you should only use flags when appropriate. Not every task requires a flag. Please review the task instructions for when to use flags. If you select the Foreign Language, Unexpected Porn, Unexpected Download, or Error / Did Not Load flags, the Landing Page Rating becomes disabled. You do not have to answer this question.

18. Landing Page Rating slider - Here you must answer the question: Does this page satisfy the user intent? This slider is divided into four different answer ranges (Dissatisfaction Likely, Dissatisfaction Possible, Satisfaction Possible, Satisfaction Likely) on a number scale from -100 to 100. The slider allows you to enter a rating at any point by moving the marker along the scale. Move the marker to the appropriate rating based on your evaluation of the landing page. (As before, the little red flag disappears when you have selected a rating.) Remember to review the task instructions for criteria for each range, and that these are different from those applied to the ad creative. See the FAQ article on the Rater Hub for more information about rating with sliders (“How do I rate using a slider?” and “What if I never seem to use a certain part of the slider?”).

19. Comment box - Enter your comments into this box.

Remember, you also still have the option to click the “Skip this Task/Report a Problem” link at the bottom of the page.

20. Submit button - Once you answer the question on this page, the Submit button becomes enabled. You can submit your ratings for this task.

Button is disabled.

Button is enabled! You can submit! You will automatically receive your next task once you click Submit.
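As a rough illustration of how a slider position relates to the four named answer ranges, and of when Submit becomes available on the landing page, consider the sketch below. The function names are hypothetical. The range boundaries are an assumption: the text gives only the -100 to 100 scale and the four labels, so this sketch supposes four equal-width ranges with 0 counted on the positive side. The real criteria for each range are in the task instructions.

```python
from typing import Optional, Sequence, Set

# Range labels as named in the interface description above.
AC_LABELS = ("Very Unpromising", "Somewhat Unpromising",
             "Somewhat Promising", "Very Promising")
LP_LABELS = ("Dissatisfaction Likely", "Dissatisfaction Possible",
             "Satisfaction Possible", "Satisfaction Likely")

# Landing Page Flags that disable the Landing Page Rating.
LP_FLAGS_DISABLING_RATING = {
    "Foreign Language",
    "Unexpected Porn",
    "Unexpected Download",
    "Error / Did Not Load",
}


def slider_label(value: int, labels: Sequence[str] = LP_LABELS) -> str:
    """Map a slider position in [-100, 100] to one of four range labels.

    ASSUMPTION: equal-width ranges split at -50, 0, and 50.
    """
    if not -100 <= value <= 100:
        raise ValueError("slider value must be between -100 and 100")
    if value < -50:
        return labels[0]
    if value < 0:
        return labels[1]
    if value < 50:
        return labels[2]
    return labels[3]


def lp_submit_enabled(lp_flags: Set[str], lp_rating: Optional[int]) -> bool:
    """Submit is enabled once the rating is given or no longer required."""
    rating_disabled = bool(lp_flags & LP_FLAGS_DISABLING_RATING)
    return rating_disabled or lp_rating is not None
```

Passing `AC_LABELS` as the second argument to `slider_label` gives the corresponding label for the Ad Creative slider under the same assumed boundaries.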


Search Ad Instructions During your time as an Ad Quality Rater, you should expect to encounter several different task types in the ad rating system. Instructions are always accessible at the top of all tasks. The most common task type is the search ad evaluation and instructions for this task type are included below. These instructions are identical to those found on Search Ads Evaluation tasks.

Search Ads Evaluation: General Guidelines
Version: 2015-08-20

READ INSTRUCTIONS CAREFULLY! These ratings use newly updated ads rating guidelines that are different from earlier versions of the guidelines. If you have not read the guidelines since the version date listed above, you must carefully read or review all instructions before rating.

Search ad rating involves interpreting a user query. A user query is the set of keywords that a user enters into the Google search engine. When rating a search ad, perform the following steps:
1. Review the Google search results page, try to understand the user query, and form an opinion about what the user hopes to accomplish by using a search engine.
2. Use the evaluation criteria found in the following instructions to analyze an advertisement and the advertising experience the user will have if he or she clicks on the ad.

User Intent An understanding of the user intent is necessary to accurately rate a search ad. The user intent is what the user hopes to accomplish by using the Google search engine. Note that users use the search engine to look for a variety of things, and there are many user intents. Some queries are very easy to understand, others are more difficult, and some may seem impossible to understand. Regardless of its meaning, you must research the query and form an opinion about the user intent. We strongly advise you to review the Google search results page to determine user intent. In order to objectively determine how promising or unpromising an advertiser offering is for a particular user query, it is important to form an opinion about the user intent before beginning an analysis of the advertisement.

Queries with Multiple Meanings If a query has multiple meanings, please consider that all meanings can be placed on a spectrum between plausible meanings and highly implausible meanings. When analyzing an advertiser offering, consider what meaning the advertiser uses and where it falls along this spectrum. This will help you determine the appropriate search ad rating.

Plausible Meanings If a query has several plausible meanings, it is important to consider them all. If an advertiser assumes a particular meaning in an ad or on a landing page, and it is a reasonable meaning, assume that is the meaning that the user intended.


Refer to the following example to better understand plausible meanings: User query: [ java ] This query could refer to an island, coffee, or a computer language. With no additional information available, it is impossible to say which meaning the user intended. Ads that respond to any of these meanings are acceptable since all three meanings are reasonably plausible.

Possible but Unlikely Meanings If an ad or landing page assumes a meaning that is possible but not very likely, this is a secondary interpretation. An ad or landing page that addresses only a secondary interpretation of the query is given a lower rating than an ad that addresses a plausible meaning. Use the Secondary Interpretation of Query flag in this case. Generally, rate an ad or landing page that responds to a secondary interpretation negatively. Refer to the following example to better understand possible but unlikely meanings: User query: [ paris ] While there are a number of cities called Paris, unless there is some reason to think that one of the smaller cities is meant, a query mentioning Paris is probably referring to Paris, France. So, an ad for hotels in Paris, Texas instead of Paris, France probably misinterprets the user intent. Even if the ad is otherwise a good one, rate it on the negative side of the scale and use the Secondary Interpretation of Query flag.

Implausible Meanings If an ad or landing page assumes a meaning that is completely implausible, treat it as completely wrong and choose a very negative rating. Do not use the Secondary Interpretation of Query flag if the meaning is clearly implausible. Refer to the following example to better understand implausible meanings: User query: [ paris ] The query probably refers to the city of Paris, France. If an advertiser interprets the meaning to be plaster of paris, it is almost certainly not addressing the user query. Use a very negative rating.
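The three cases above (plausible, possible but unlikely, implausible) amount to a small decision table. The sketch below restates it in code purely as a memory aid. The names `Plausibility`, `Guidance`, and `interpretation_guidance` are hypothetical, and deciding which category an advertiser's interpretation falls into remains a human judgment that the code does not attempt.

```python
from enum import Enum
from typing import NamedTuple


class Plausibility(Enum):
    PLAUSIBLE = "plausible"
    SECONDARY = "possible but unlikely"
    IMPLAUSIBLE = "implausible"


class Guidance(NamedTuple):
    rating: str               # where on the scale the rating should fall
    use_secondary_flag: bool  # whether to tick Secondary Interpretation of Query


def interpretation_guidance(plausibility: Plausibility) -> Guidance:
    """Restate the guidance for the advertiser's reading of the query."""
    if plausibility is Plausibility.PLAUSIBLE:
        # Treat the advertiser's meaning as the user intent; no flag.
        return Guidance("rate as if this were the user intent", False)
    if plausibility is Plausibility.SECONDARY:
        # Secondary interpretation: generally negative, and flagged.
        return Guidance("generally negative", True)
    # Completely implausible: very negative, and the flag is NOT used.
    return Guidance("very negative", False)
```

With the [ paris ] examples, an ad for hotels in Paris, Texas would fall under `Plausibility.SECONDARY`, while an ad for plaster of paris would fall under `Plausibility.IMPLAUSIBLE`.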

Misspelled Queries Users often misspell queries. When evaluating a query, if it is clear what the user means, and the misspelled version of the query has no meaning, ignore the misspelling.


Analysis is more difficult if the query appears to be a misspelling, but the misspelled version has a unique meaning. First consider the query as the user entered it, and then consider if it may be misspelled. If advertisers respond to misspellings, ratings may need to be adjusted. Refer to the following example to better understand misspelled queries: User query: [ goodnight moom ] There is a famous children’s book called Goodnight Moon. It is very possible that the user means to type [ goodnight moon ] but types [ goodnight moom ] instead. However, there is actually a novel titled Goodnight Moom. While the novel is quite obscure, it might be what the user wants.

Advertiser Responds to Actual Spelling in Query If the advertiser assumes that the query is correct as it stands (in the example above, assumes the user meant [ goodnight moom ]), treat the advertiser’s query interpretation as acceptable. You would then need to decide separately how promising the ad or landing page is.

Advertiser Responds to a Plausible Correction of Spelling in Query If the advertiser assumes that the query is misspelled and addresses a corrected version of the query (in the example above, assumes that the user mistyped and meant [goodnight moon]), judge for yourself whether this was a good assumption. If you think it was a reasonable assumption, the ad and landing page are treated as if this were the user intent. Don’t modify your scores to account for the spelling correction, and don’t use the Secondary Interpretation of Query flag.

Advertiser Responds to a Possible but Unlikely Correction of Spelling in Query If the advertiser assumes a corrected spelling that you think is possible but not very likely, this is a secondary interpretation. An ad or landing page that addresses only a secondary interpretation of the query is given a lower rating than if it had responded to a likely or plausible meaning. The Secondary Interpretation of Query flag must be used. An ad or landing page that responds to a secondary interpretation is generally rated negatively.

Implausible Spelling Correction If the advertiser’s interpretation of the query is based on a completely unreasonable assumption, treat it as completely wrong and give it a very negative score. The Secondary Interpretation of Query flag is not used in this case. Continuing with the previous [ goodnight moom ] example, both query interpretations are reasonable. The only way to know this is to research the query and analyze how the advertiser interprets it.


Queries for which a Reasonable Ad is Impossible Sometimes a query is either so hard to interpret or so non-commercial in nature that no ad will be a good match. Be careful in these cases: rate the ad and landing page according to how well they actually respond to the query, and do not worry about how hard it would be to show an appropriate ad for that query. Do not give an ad positive ratings if a better ad for the query cannot be determined or if it seems like it was a good try. Rate it positively only if it addresses the query intent. If the query intent cannot be determined, the ad must be rated negatively. For example, a query of [ www ] or [ when did ] is not complete enough to serve a proper ad.

Unrateable Queries In some rare cases, a query may appear that is the result of an error in how the task was added to the evaluation system. For example, a query may appear in the incorrect rating language, or a query of jumbled characters may appear that, after research, has no discernible meaning. Do not attempt to provide AC or LP ratings for queries like this. Instead, use one of the flags provided in the Query Flag section. These flags are only present on the first page of a Search Ads task, the Ad Creative rating section.

Approximate Query Location Many tasks will include a map indicating the geographical region where the query was entered. If you are unfamiliar with the location in the map, you can zoom in and out of the map to familiarize yourself with it. Knowing the approximate location where the query was entered may help you decide how relevant the Ad Creative or Landing Page is to a user. Some ads will be more relevant given the location, some will be less relevant. For some ads, knowing the location will not make any difference.

When Location Matters Sometimes the query, the ad, or both may refer to a specific geographic location. Even when the approximate query location is available, it can sometimes be difficult to determine how to handle a location-specific advertisement.

Location Mismatch Between Query and Approximate Query Location If the query contains geographical information, you should use the information available in the query instead of the map, especially if the two conflict. For example, if the query is [pizza in new york], but the map indicates Los Angeles, pizza restaurants in Los Angeles are a bad result.

Query Specifies a Location but Ad Does Not If a query specifies a location, take it into account when evaluating the ad. Sometimes research is required to determine whether the product or service is compatible with the location. This research is required before you can choose an Ad Creative rating score. User query: [ pizza Santa Monica ] If an ad pointed to the main Round Table Pizza chain homepage, but didn’t mention this California city, this might initially seem useful. However, if, upon using the location finder on the site, there are no locations within a reasonable distance of Santa Monica, this ad is probably not useful.

Ad Specifies Location but Query Does Not If the query does not specify a location and the approximate query location is also not available, then evaluate the ad as if the user were in an appropriate location. For example, if the query is [ pizza ] and the ad is for a pizza restaurant in Barstow, California, assume that the user was in Barstow, California.

Neither Query Nor Ad Specify a Location If the query does not specify a location and the ad does not either, then assume that the user can be anywhere in the target country of the rating language. Ignore the approximate query location, if one is provided.

Location Mismatch Between Query and Ad Assuming there is a match between the product or service and the query and ad, take the location proximity into account in evaluating an ad. If the ad for a given query specifies a different and incompatible location, this makes it a worse ad. User query: [ pizza Santa Monica ] If the ad is for a pizza restaurant in Manhattan, this is very unpromising. However, an ad returned for pizza in a different but nearby location, like a neighboring town, could be useful, and this ad might not be as bad as the previous example. Use common sense to determine if the ad exceeds a reasonable distance for the user, and an acceptable distance may vary depending on the query. For example, a user may be willing to travel farther to buy a new car than to get a haircut or go to the supermarket. For certain queries, serving an ad with a completely different location may still be promising. For example, if a user in the United States is looking for [ vacation in Australia ], then an ad for “vacation in New Zealand” is not necessarily a bad ad since the user is likely to be willing to travel a long distance for a vacation.
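The location rules above describe an order of precedence for deciding where to imagine the user is. A minimal sketch of that precedence follows. The function name and the `ANYWHERE` constant are hypothetical, and one branch is an assumption: the guidelines do not say outright what to do when the ad names a location, the query does not, and a map is available, so the sketch supposes the map is used in that case. Whether the ad is within a reasonable distance of the resulting location is still a common-sense judgment.

```python
from typing import Optional

ANYWHERE = "anywhere in the rating country"


def assumed_user_location(
    query_location: Optional[str],
    map_location: Optional[str],
    ad_location: Optional[str],
) -> str:
    """Pick the location to imagine the user in when judging an ad.

    Each argument is a place name, or None when that source gives no location.
    """
    if query_location is not None:
        # A location in the query wins over the map, especially if they conflict.
        return query_location
    if ad_location is None:
        # Neither query nor ad names a place: the user can be anywhere in the
        # rating country, and the map is ignored.
        return ANYWHERE
    if map_location is not None:
        # ASSUMPTION: with no location in the query, an available map tells us
        # where the user is when the ad is location-specific.
        return map_location
    # No query location and no map: evaluate as if the user were in the
    # ad's location.
    return ad_location
```

For the [pizza in new york] example with a map showing Los Angeles, the function returns the query's location, so Los Angeles pizza restaurants are a poor match.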

Search Ad Rating: How Promising is this Ad? Evaluate the Ad Creative. A promising ad is one that seems like it will give the user what he or she wants. An unpromising ad looks like it will be disappointing, unhelpful, dangerous, or irrelevant. Use the slider bar to select from four possible ratings:
• Very Promising
• Somewhat Promising
• Somewhat Unpromising
• Very Unpromising
Very Promising and Somewhat Promising are positive ratings: use them for ads that look good for the user to see and worth clicking on. Somewhat Unpromising and Very Unpromising are negative ratings: use them for ads that look bad for the user and not worth clicking on. Consider the following factors while evaluating an ad:

Satisfying the User Does the ad offer something that will satisfy the user? An ad that has nothing to do with what the user wants is always very unpromising. An ad cannot be promising if it is not relevant. However, being relevant is not enough to make an ad good.

Correct Meaning of the Query Does the ad address the correct meaning of the query? An ad for a car dealership does not address the query [ cars 2 ]—that is the name of a movie. Even if it would be a good ad for some other car-related query, it is a completely terrible and unpromising ad for that query.

Clarity and General Appeal Is the ad written in a clear, appealing way? An ad that makes sense and does not have mistakes, hard-to-understand language, or awkward phrasing can be good. An ad that looks stupid, looks like it was written by a machine, is unintentionally funny, or just does not make sense is usually bad. Does the ad clearly state what the advertiser offers? A good ad is easy to understand. A bad ad may be overly vague or may not communicate enough information to conclude that the ad will lead to a positive advertising experience.

Potentially Scammy Ad Does the ad look like a scam? An ad that seems too good to be true, sleazy, or deceptive to users is usually bad.

Promise of Additional Links Some ads contain multiple links to different sections of the website. You do not need to click on these links, and when you are rating the creative, you cannot click on them. However, if these links look promising or useful, this may be a reason to increase your ad creative score. If links look unpromising, confusing, or useless, this may be a reason to decrease the ad creative score. See the Ads with Additional Features section for more detailed guidance.

Advertiser is Different Merchant or Provider from Query Sometimes a query will specify both a product or service and a particular merchant or provider. If the ad offers the desired product or service from a merchant or provider that is not the one specified in the query, it should not be considered a negative user experience unless there is another issue with the ad.

Analyzing the Advertiser Visible URL The web address (also called the visible URL) displayed in the ad can provide clues about how promising or unpromising an ad may be. The visible URL can affect your evaluation in the following three ways:

• If you are familiar with the advertiser based on the URL displayed in the ad, you may use your background knowledge when rating the ad. Just use the “Used Prior Knowledge In Judging Advertiser” flag.
• If you aren't familiar with the merchant, assume the merchant is legitimate, even if that's not how you behave in your own online activities. Important note: this only applies if there's nothing in the URL that looks suspicious (see next bullet point).
• If the URL itself makes you suspicious, don’t hesitate to mark the ad bad. For example, an ad for online book shopping that looks very good except that the URL of the merchant is www.amazom.com is pretty suspicious--that looks like the merchant is trying to trick you into thinking you’re going to amazon.com. This is likely a scammy ad, and if you think an ad is scammy, it deserves a bad rating. (Other tricks of this sort in addition to misspellings in URLs include adding random numbers, unexpected extensions, or subdomains to create URLs like www.amazon22.com, windows7onet.in, or windows7.customersupportus.net.)
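
The URL tricks described in the last bullet (misspellings, appended numbers, brand names used as subdomains) can be illustrated with a short Python sketch. It is not a tool used in the rating program, and it does not replace the rater's judgment. The `KNOWN_HOSTS` table, the `imitated_brand` function, and the 0.8 similarity threshold are all assumptions made for this example.

```python
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical table of well-known advertisers: brand label -> genuine host.
KNOWN_HOSTS = {"amazon": "amazon.com"}


def imitated_brand(visible_url: str, threshold: float = 0.8) -> Optional[str]:
    """Return the brand a visible URL seems to imitate, or None."""
    host = visible_url.lower().split("//")[-1].split("/")[0]
    if host.startswith("www."):
        host = host[4:]
    for genuine in KNOWN_HOSTS.values():
        if host == genuine or host.endswith("." + genuine):
            return None  # the genuine site, or one of its own subdomains
    # Compare every label except the final extension, so that both the
    # main name and any subdomains are checked against known brands.
    for label in host.split(".")[:-1]:
        name = label.rstrip("0123456789")  # ignore appended digits
        for brand in KNOWN_HOSTS:
            if SequenceMatcher(None, name, brand).ratio() >= threshold:
                return brand
    return None
```

With this table, `imitated_brand("www.amazom.com")` and `imitated_brand("www.amazon22.com")` both return `"amazon"`, while `imitated_brand("www.amazon.com")` returns `None`. A string-similarity check like this only covers near-misspellings of brands it already knows about, which is one reason the guidelines ask for human judgment.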

Do not assume an ad is promising just because it contains the same words as the query. Do more than note that the words match—machines can tell us this. We need human judgment: tell us whether a human being will find an ad appealing. If a user is looking for [ blue pants ] an ad that says “PANTS BLUE BLUE PANTS www.bargainautoparts.com” is likely a bad ad even though it has the words “blue pants.”

Distinguishing Between Very Promising and Somewhat Promising If an ad looks like it will lead to a page that satisfies the user intent, it deserves a rating of Somewhat Promising or Very Promising. A good ad deserves one of the ratings described in the following two sections.

Very Promising A Very Promising ad should look like it points to a page where a user can find almost exactly what is described in the query. If the user is looking for a particular product, the ad should appear to point to a page for that product. If the user seeks a particular kind of store, the ad should appear to point to a store of that kind. If the landing page of a Very Promising ad does not satisfy the user intent, it will be a surprise and a disappointment to the user.

Somewhat Promising

A Somewhat Promising ad should also appear to take the user to a page where the product he or she is looking for can be found; however, rather than appearing to point to a page where the user can find exactly what he or she wants, a Somewhat Promising ad might do one of the following things:
• It might look like it points in the right direction but not exactly at the target. For example, if the user seeks a specific model of camera, an ad that looks like it will point to a reputable camera store’s main page is Somewhat Promising.
• It might look like it points to something that might satisfy the user intent but is not exactly what he or she wanted. For example, if the user seeks a particular model of camera, a Somewhat Promising ad might point to a slightly different but reasonably similar model of camera.
Sometimes it is just not possible to be confident about what the user seeks. If an ad seems to point in the right general direction but there is no way to tell exactly what the user wanted, Somewhat Promising is the highest possible rating.

Distinguishing Between Positive and Negative Ad Creative Rating Evaluate the Ad Creative and weigh the criteria for Very Promising/Somewhat Promising against the criteria for Somewhat Unpromising/Very Unpromising. If you find that positive elements and negative elements both seem applicable to the creative you’re evaluating, ask yourself which side of the positive/negative division seems to be a more reasonable fit and choose a rating on that side.

Distinguishing Between Somewhat Unpromising and Very Unpromising It is especially important to distinguish between ads that are simply bad and ads that are very bad for the user entering the query. The following section provides additional guidance for distinguishing between the Somewhat Unpromising and Very Unpromising ad creative ratings.

Somewhat Unpromising A Somewhat Unpromising ad generally isn’t a great ad to show the user, but it is likely that some subset of users may find it useful.
• Even if the creative is not of the exact same topic as the query, as long as there is some clearly related task or intent, there are some users who may find the creative appealing. One example is if the user seeks [ weight loss pills ] and the ad is for “diet tips” or “exercise machines”. These types of ads should be rated as Somewhat Unpromising. They don’t directly provide what the user is looking for, but could be somewhat useful to the user, so they don’t deserve the lowest ratings.
• If it is not really clear whether users will find the ad useful, rate it as Somewhat Unpromising. One example is if a user is searching for some information and the ad asks the user to search for the same information again elsewhere. It is hard to know in these cases whether the ad will be able to provide anything useful to the user, since he/she is being asked to repeat the same action again, possibly just to get similar results. Please view the Google search results for the query to get an understanding of what the user currently sees and what information he/she currently has access to. If you believe that the ad won’t provide any additional information beyond what is already presented to the user, rate it as Somewhat Unpromising. However, if you believe that clicking the ad will provide additional useful information to the user, don’t rate it as Somewhat Unpromising--give it a higher rating. One example that would deserve a higher rating than Somewhat Unpromising is if the user is searching to buy a particular item and the ad is asking the user to search for that particular item across numerous stores and merchants. Another example that would again deserve a higher rating is a query for some specific industrial machinery part, and an ad inviting the user to repeat the search on a search engine devoted to machine part sales and manufacture.
• Sometimes a query specifies a location, and the ad targets a different location. For these specific examples, please refer to the When Location Matters section.

Very Unpromising There are several cases where Very Unpromising is the only appropriate rating.
• Very Unpromising ads have no reasonable chance of satisfying the user. Try to put yourself in the user's mindset: is it possible at all that the creative offers something useful to the user? If there is no reason at all to think that the user will find the creative useful, rate it Very Unpromising. (Note: you might think “It’s always possible that someone might find anything useful, even though it has nothing to do with the query.” Don’t go that far!)
• If the creative looks like a scam, or leads the user to harm, rate it as Very Unpromising.
• If the creative falls into one of the categories listed in the Machine-generated Ads section, rate it as Very Unpromising.
• If the creative promises to do the impossible, such as selling a person or city, rate it Very Unpromising.
• Just because there is a strong term overlap between the query and creative does not mean the ad is a good match for the query. If the user is searching for [ homeowners insurance ] and the ad is for “medical insurance,” the user will very likely not find the creative useful and you should rate it as Very Unpromising.

Machine-generated Ads Some ads are partially auto-generated to take words from the query and place them in the creative text. There is nothing wrong with this in itself. For example, if the query is [ xbox 360 used ] and the creative says “Buy a used xBox 360 on eBay,” that’s a good ad. Unfortunately, sometimes these machine-generated ads turn out very badly. Very Unpromising ad creatives may have some of the following issues:

Things offered that cannot be bought
User query: [ san diego, ca ]
An ad that says “Buy San Diego cheap on eBay” is ridiculous--you can’t buy a city. Ads that are unintentionally ridiculous, horrible, or offensive, by suggesting that you can buy concepts, human beings, body parts, criminal acts, or similar things are Very Unpromising.

Part of the query removed, substantially changing the meaning
User query: [ roses lime juice ]
An ad that offers the action, “Buy roses,” has left out so much of the query that the entire meaning has changed. When only part of the query text is used, what remains can mean something substantially different.

Part of the query removed, resulting in overly general ad
User query: [ how do i remove gum from satin ]
An ad that offers “Get information on how to remove,” is nearly meaningless: too much has been removed from the query. When only part of the query text is used, the result is far too general to be promising for the user query.

Nonsensical, jumbled, or ungrammatical ad creative
User query: [ how do i remove gum from satin ]
An ad that says “Search for how do I remove gum” or “Find how do I remove gum from satin” is awkward and ungrammatical. Ads that end up nonsensical, jumbled, or ungrammatical because a query has been crammed into a space where it doesn’t really belong are Very Unpromising.

Be on the lookout for these. If you’re not paying close attention to how the ads actually look and sound, it can be easy to think these look fine, but to a user who is actually reading the text, they can look laughable, annoying, or foolish, and in some cases, deeply offensive or hurtful. Even those that just look sort of silly or awkward are very bad.

Ads with Additional Features Some ad creatives are just text and a single link to the advertiser page. Other ad creatives contain additional features that may or may not provide something valuable to the user. Ad creatives may contain maps, videos, images, star ratings from customers, multiple links to specific pages on the advertiser site, and a variety of other features.

These additional features may affect the quality of an ad creative. If the special features add to or detract from the appeal, informativeness, or usefulness of an ad, the Ad Creative score can be raised or lowered. For an ad that contains only text and a single link to an advertiser page, use only the previous criteria in making a decision. For ads that contain anything in addition to text and a single link, consider the following factors, and decide whether to raise or lower your score:
• If an ad does not deserve a score of Somewhat Promising or Very Promising based on the previous criteria, be cautious about giving it a positive rating just because of additional features.
• An ad that deserves a score of Somewhat Promising or Very Promising based on the previous criteria can be given a negative rating if additional features detract from it severely.
• Use common sense when deciding whether additional features improve or detract from an ad enough to move it between Somewhat Promising and Very Promising scores in each category.
• An ad that is scammy or harmful can never be improved by additional features.
• Additional features should relate to the user intent in a sensible way. If the user is looking for information about a current movie, a movie trailer in the ad creative relates to the user in a sensible way, but a map to the movie studio where the film was made does not. The trailer probably improves the experience, but the map detracts from it.
• Where an additional feature is relevant to the user intent, it should be informative, easy to use, and clear. If it is confusing, boring, or hard to figure out, it may either detract from the experience or just fail to improve it.
• An ad may have multiple additional features. Consider all of them together when determining your ad creative score.
• Raise or lower your rating by a small amount if the additional feature has little impact on the ad creative. Raise or lower it a large amount, according to the previous criteria, if the additional feature has a big impact.
• Do not consider the physical size of an ad creative when rating it. If the size of an ad creative causes it to display incorrectly in the ads rating interface, alert an administrator but ignore the issue while rating it.

Search Ad Rating: Does Landing Page Satisfy User Intent? Use the slider bar to select from the following four rating categories while determining how likely it is that a landing page will satisfy the user intent. Only consider the user query and the landing page, and ignore the ad creative completely.
• Satisfaction Likely
• Satisfaction Possible
• Dissatisfaction Possible
• Dissatisfaction Likely
Satisfaction Likely and Satisfaction Possible are positive ratings, for landing pages that satisfy the user query. Dissatisfaction Possible and Dissatisfaction Likely are negative ratings, for landing pages that do not satisfy the user query.

Only rate the landing page that opens after clicking on the Visit Landing Page button. Do not base your score on pages that are accessible by clicking on links in the body of the ad creative. NEVER copy and paste a link to visit the page, and NEVER manually change the URL.

The fundamental principle of landing page evaluation is this: the user starts a search on Google.com with a goal in mind. The user then enters a query and reviews Google’s search results and ads. The user then clicks on the ad currently being reviewed, and that ad takes the user to the landing page. Keep in mind that for a user to have a positive experience with an advertiser landing page, he or she should end up closer to the goal expressed in the query; otherwise, it is a negative experience. The section below explains how distance from the user’s goal helps determine a landing page rating score.

Distance from the User’s Goal Carefully review the Google search results page to determine the distance from the user’s goal. Does the Landing Page move the user closer to his or her goal, further from the goal, or neither closer nor further from the goal? If the user is closer to the goal, the landing page deserves a positive rating. For example, if the user is hoping to buy a specific camera, and the landing page is a store offering that camera for sale, the user has come closer to accomplishing his or her goal. If the user is further from the goal, the landing page deserves a negative rating. If the user is hoping to buy a specific camera, and the landing page is a store offering pet food, this is a dead end. The user will need to go back to the search page or start a new search, so he or she is actually further from the goal than before clicking on the ad. If the user is neither closer to nor further from the goal, the landing page deserves a negative rating. If the user is on a Google search results page and clicks on an ad that just takes them to a page of similar search results, which overall did not provide any additional value, no progress has been made; the user is no closer to or further from the goal than before clicking the ad. Deciding this is not an exact science. Rely on good judgment. The following guidelines explain in more depth how to rate landing pages in general, but they do not explain how to rate a landing page in every situation.
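
The three outcomes above map onto the two sides of the rating scale as in the following Python sketch. The `Progress` enum and the `landing_page_side` function are hypothetical names introduced only for illustration. Judging which outcome applies, and choosing a rating within the positive or negative side, remains a matter of rater judgment.

```python
from enum import Enum


class Progress(Enum):
    """Where the user stands relative to the goal after clicking the ad."""
    CLOSER = "closer"
    NO_CHANGE = "neither closer nor further"
    FURTHER = "further"


def landing_page_side(progress: Progress) -> str:
    """Map distance from the user's goal to the positive or negative side."""
    # Only real progress toward the goal earns a positive rating; a page
    # that adds nothing is treated as negative, just like a dead end.
    return "positive" if progress is Progress.CLOSER else "negative"
```

The point the sketch makes explicit is that "no progress" is not neutral: a page that leaves the user where they started lands on the negative side.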

Distinguishing Between Satisfaction Likely and Satisfaction Possible Satisfaction Possible and Satisfaction Likely are positive ratings. If the landing page offers the user exactly or very nearly what he or she wants, use a Satisfaction Likely or Satisfaction Possible rating.

Satisfaction Likely To receive this rating, a landing page must offer just what the user looked for. If the user wants car reviews, it should offer car reviews. If the user wants car reviews about a specific model, it should offer car reviews about exactly that model. If the user wants a category of product, the landing page should be devoted to or include that exact category of product. For a Satisfaction Likely rating, what the user is looking for should be apparent with no additional action needed by the user. It is permissible, however, to click on a link to get detailed information.

Satisfaction Possible Use this rating if the page is satisfactory but does not immediately present exactly what the user seeks. If the product or service is for sale on the site, but a search or straightforward navigation is required to find the item, select a rating of Satisfaction Possible rather than Satisfaction Likely. If the site offers a very plausible substitute for a particular product specified in the query, it may receive a rating of Satisfaction Possible or lower. If the query is a search for information, and this information can be found without too much trouble on the advertiser site but is not on the landing page, use Satisfaction Possible. The one exception is if the user could have found that same information on the search results page before clicking on the ad. If that is the case, the landing page does not deserve a positive rating.

Considering Trustworthiness Do not give a landing page a Satisfaction Possible or Satisfaction Likely rating if you do not trust the information found on that landing page or if you would not make a purchase from the advertiser site. A page that offers the exact product that a user is looking for is useless unless the user trusts it enough to actually make a purchase there. A seemingly trustworthy merchant selling a particular camera at a particular price might deserve a better rating than a page that clumsily aggregates a random set of products, even if the same camera at the same price is offered on that page too. Similarly, a page offering the exact information that the user is looking for is not useful if there is no reason to think that the information is correct. For example, if the user seeks some medical information, a site belonging to a medical school is a good source of trustworthy information while a blog post by an unknown person is a much more doubtful source. Never use a rating of Satisfaction Likely or Satisfaction Possible if the page appears scammy or harmful.

Specific Versus General: Mismatch Between Queries and Landing Pages Sometimes when the query is for a specific product, the landing page is basically on target but much broader or much more specific than the query.

If the landing page has the product specified in the query, but a huge number of other products too, this may be a decent experience, but probably is not good enough to get into the Satisfaction Likely range in most cases. If the query is for something general, like [ camera ], the landing page might be extremely specific: for example, a product page for a particular model of camera from a particular manufacturer with a particular set of options. In this case, too, it might appear to be a decent experience, but it probably is not good enough to get rated as Satisfaction Likely in most cases. You may judge that in particular cases, the experience is better or worse than the guidelines above would suggest. For example, if the page has a huge number of different products, but the product in the query is clearly the most prominent and the first thing you see, you might decide it deserves Satisfaction Likely; if it’s so buried in the other products that you don’t even realize it’s there, you might decide it deserves a negative rating. Similarly, if the query looks general and the landing page is for a very specific product, you might think that the product is so obviously the best possible thing to offer for that query that it deserves Satisfaction Likely; on the other hand, if the product is technically in the right category but very unlikely to be what the user wants (for example, an expensive antique camera requiring glass plates instead of film for the query [ camera ]), you might decide that it deserves a negative rating.

Distinguishing Between Positive and Negative Landing Page Rating Evaluate the query and the Landing Page and weigh the criteria for Satisfaction Likely/Possible against the criteria for Dissatisfaction Possible/Likely. If you find that positive elements and negative elements both seem applicable to the landing page you’re evaluating, ask yourself which side of the positive/negative division seems to be a more reasonable fit and choose a rating on that side.

Distinguishing Between Dissatisfaction Possible and Dissatisfaction Likely Dissatisfaction Likely and Dissatisfaction Possible are negative ratings. If you think that the user will have a negative experience, always use either Dissatisfaction Possible or Dissatisfaction Likely. If you have no particular reason to think a page will interest the user, always use either Dissatisfaction Possible or Dissatisfaction Likely.

Dissatisfaction Possible
• If the page is marginally related to the query and you think that there’s a small chance the user would be interested, use Dissatisfaction Possible.
• If the page can eventually lead to what the user wants, but only through many clicks or through clicks that lead to an entirely different website, use Dissatisfaction Possible.
• If the page offers something that you think the user might be interested in, but not what the user was looking for and not especially close to it, use Dissatisfaction Possible. For example, if the user is looking for baseball gloves, and the landing page offers athletic socks, there’s probably some chance that the user might be interested. However, it’s not what the user was looking for, and not all that close to it, so it deserves Dissatisfaction Possible.
• If the page can eventually give the user what he or she is looking for, but the process is protracted and difficult, use Dissatisfaction Possible.

Dissatisfaction Likely
• If the page has nothing to do with the query, use Dissatisfaction Likely.
• If the query is for a product or service, and neither the product/service nor anything close to it can be purchased from the page, use Dissatisfaction Likely.
• If the query or a word in the query has two meanings, it is clear which meaning is intended by the user, and the advertiser responds to the wrong meaning, use Dissatisfaction Likely. For example, [ cars 2 ] refers to a movie. A page for a car dealership is clearly a bad landing page for this query, even if it might be a good result for [ car sales ].
• If the page looks like a scam, you think users could be harmed by it, or it either attempts to trick the user into downloading something by labeling a download button in a confusing way or tries to download a file without action by the user, use Dissatisfaction Likely.
• If the page loads but is completely unusable (for example, because some content does not load, or the page doesn’t display properly), use Dissatisfaction Likely. If enough of the page does not load at all (for example, you encounter a 404 error), use the Error Did Not Load flag instead of a rating.
• If the page is very bad for any other reason, use Dissatisfaction Likely.

Query Flags Use these flags to indicate that a query is unrateable. This means that it, and the AC and LP paired with it, are not eligible to be assigned ratings. A Search Ads query is unrateable if it has one of the following problems:
• it is in a language other than the task language (Foreign Language)
• it is unambiguously pornographic or about sexual services (Porn)
• it is complete nonsense; research reveals no plausible meaning (Nonsense)
• it was transcribed incorrectly, using an English rather than Cyrillic keyboard for Russian words (Russian Transcription Error)

If you use one of these flags, all of the later questions will turn gray and don’t need to be answered. Note that these flags are only on the first page of a Search Ads task, the Ad Creative rating section.

Foreign Language Query Use this flag when the query is in a language other than the language of the task. If the query contains words or phrases in another language, but there is enough content in the task’s language that it is understandable, do not use this flag. If the query appears to be in a foreign language, but research reveals that the query term may be commonly used in your rating language or is the name of a specific group, product, or person, do not use this flag. Remember to check the language of the task, especially if you work in multiple languages. If your rating language is English, you rate ads in English for English queries. If you rate in another language, you will rate some tasks in that language and some tasks in English. Your rating language is always designated at the top of the task page. Even if you speak the language of the query, if the task is supposed to be for a different language, use this flag.

Porn Query Use this flag only for queries that are unambiguously for pornographic content or sexual services. Queries for racy or suggestive content, medical information, or art photos generally shouldn’t get this rating. Queries for dating services generally shouldn’t get this rating, unless those dating services depict nudity or specifically identify themselves as sexual rendezvous services.

Nonsense Query Use this flag for queries that are complete nonsense, where research reveals no plausible meaning. As you research, take into consideration that queries that may look like nonsense might actually turn out to be meaningful. The following are examples of queries that do have meaning and should not receive the Nonsense Query flag:
• a misspelling
• a product code or model number
• technical specifications
• a partial web address or YouTube video ID
• a specific username or Twitter handle
• an uncommon acronym or abbreviation

Don’t assume that a query is nonsensical just because you do not immediately know what it means. Encountering a completely nonsensical query is rare. Most queries mean something, so you should always research the query, even if at first it seems like nonsense. Only use this rating when there is no way for you to reasonably guess about user intent, even after researching the query.

Russian Transcription Error This flag applies only to raters working in Russian. If you are working in a language other than Russian, this flag will never be applicable to your tasks, and you should not use it. If you are working in Russian, you will receive separate instructions for determining when queries should be considered transcription errors. While you will not be able to assign AC, LP, or AC to LP ratings after using one of these Query Flags, you will still need to submit the task for your answers to be recorded. You will submit your responses directly from the first page of the task by clicking the Submit button at the bottom left of the task.
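
The effect of the Query Flags on the rest of the task can be pictured with a short Python sketch. The `QueryFlag` enum and the `ratings_to_assign` function are hypothetical names chosen for this example and do not describe how the EWOQ interface is implemented.

```python
from enum import Enum
from typing import List, Optional


class QueryFlag(Enum):
    """The four flags that mark a Search Ads query as unrateable."""
    FOREIGN_LANGUAGE = "Foreign Language"
    PORN = "Porn"
    NONSENSE = "Nonsense"
    RUSSIAN_TRANSCRIPTION_ERROR = "Russian Transcription Error"


def ratings_to_assign(flag: Optional[QueryFlag]) -> List[str]:
    """List the ratings a rater still assigns before submitting the task."""
    if flag is not None:
        # Any query flag grays out the later questions; the task is
        # submitted from the first page with no ratings assigned.
        return []
    return ["AC", "LP", "AC to LP"]
```

In either case the task still has to be submitted for the answers to be recorded; the flag only changes which ratings are expected.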

Ad Creative Flags If an Ad Creative meets the criteria for using one of the following flags, please use that flag. If criteria are not met for a flag, do not use the flag.

Navigational Bullseye Use the Navigational Bullseye flag when both these things are true: 1. The query appears to be a search for a particular website, section of a website, or web page. 2. The creative looks like it will point to the corresponding website, section of a website, or web page. Not every query is a search for a particular website--in fact, the vast majority are not. The Navigational Bullseye flag should only be used where the frame of reference is similar or compatible between query and creative. For example, with the query, [ ford explorer ], the Navigational Bullseye would be used for creative that appears to take the user to the Ford Explorer section of the Ford website; however, the flag would not be used if the creative appeared to take the user to a different page on the Ford site (a page devoted to the Ford Focus) or a general page on the Ford site (their homepage, for example).

Foreign Language Use this flag when the creative is in a language other than the language of the task. Remember to check the language of the task, especially if you work in multiple languages. Even if you speak the language of the creative, if the task is supposed to be for a different language, use this flag. A creative should be legible in your rating language: if the creative contains words or phrases in another language, but there is enough content in the task’s language that it is understandable, do not use this flag and proceed with the normal creative rating. If you use this flag, some of the later questions will turn gray and don’t need to be answered.

Unexpected Porn Use this flag when both these things are true: 1. The query is not a search for pornographic content or sexual services. If the query has both a pornographic interpretation and a non-pornographic interpretation, assume that the non-pornographic interpretation is the actual user intent. 2. The creative appears to offer pornographic content or sexual services.

Use this flag only for unambiguously pornographic content or sexual services. Racy or suggestive content with no nudity, nudity in a medical context, or art photos generally shouldn’t get this flag. Dating services generally shouldn’t get this flag unless they depict nudity or specifically identify themselves as sexual rendezvous services. A regular dating service may deserve a bad rating if it doesn’t match what the query appears to be looking for, but it would not get the flag. If you use this flag, some of the later questions will turn gray and don’t need to be answered.

Used Prior Knowledge In Judging Advertiser Use this flag when knowledge you already had about the advertiser influenced your ratings, either for good or bad. Use this only when your rating is different from what you think you would have given seeing the ad for the first time with no prior experience. If a creative is clearly bad, don’t use the flag even if you already happen to have confirmation that a bad rating is deserved.

Secondary Interpretation of Query Use this flag when the creative text indicates that the advertiser is targeting a clearly secondary interpretation of the query. An interpretation is secondary if it’s reasonable, but there is some other interpretation of the query that you consider much more likely. Don’t use this flag with interpretations that are wrong or unreasonable. Don’t use this flag if you think that the query has multiple, equally likely meanings, and the advertiser is targeting one of those meanings. Do use the flag where the query has multiple, equally likely meanings and the advertiser targets an obscure or less-likely meaning. Please review the main guidelines for instructions on how to select a scale rating when you use this flag.
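One way to read the rule above is as a three-way classification of the meaning the advertiser targets, with the flag applying to exactly one class. The enum and function names below are hypothetical, and the classification itself remains the rater's judgment.

```python
from enum import Enum, auto


class Interpretation(Enum):
    # The single dominant meaning, or one of several equally likely meanings.
    MOST_LIKELY = auto()
    # Reasonable, but some other meaning is much more likely.
    SECONDARY = auto()
    # Wrong or unreasonable reading of the query.
    UNREASONABLE = auto()


def secondary_interpretation_flag(targeted: Interpretation) -> bool:
    """The flag applies only to reasonable but clearly less likely meanings."""
    return targeted is Interpretation.SECONDARY
```

Under this reading, an advertiser who targets one of several equally likely meanings falls under MOST_LIKELY and is not flagged, while an obscure meaning of the same query falls under SECONDARY and is.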

Landing Page Flags If a Landing Page meets the criteria for using one of the following flags, please use that flag. If criteria are not met for a flag, do not use the flag.

Navigational Bullseye Use the Navigational Bullseye flag when both these things are true: 1. The query appears to be a search for a particular website. 2. The landing page is that site. Not every query is a search for a particular website--in fact, the vast majority are not.

Foreign Language Use this flag when the landing page is in a language other than the language of the task, with no obvious way of getting to a version in the language of the task. Remember to check the language of the task, especially if you work in multiple languages. Even if you speak the language of the page, if the task is supposed to be for a different language, use this flag. Don’t use this flag if there is some clear way to get to a version in the target language. For example, if you are rating a Japanese task, a landing page in English with a Japanese flag in the corner pointing to a Japanese version of the site should not get this flag. If you use this flag, some of the later questions will turn gray and don’t need to be answered.

Unexpected Porn Use this flag when both these things are true: 1. The query is not a search for pornographic content or sexual services. If the query has both a pornographic interpretation and a non-pornographic interpretation, assume that the non-pornographic interpretation is the actual user intent. 2. The landing page offers pornographic content or sexual services. Use this flag only for unambiguously pornographic content or sexual services. Racy content with no nudity, nudity in a medical context, or art photos generally shouldn’t get this flag. Dating services generally don’t get this flag unless they depict nudity or specifically identify themselves as sexual rendezvous services. A page with racy content, nudity in an art or medical context, or dating services may deserve a negative rating if it doesn’t match what the query appears to be looking for, but it shouldn’t get the flag. If you use this flag, some of the later questions will turn gray and don’t need to be answered.

Unexpected Download Use this flag when any of the following happens: 1. Clicking on the Visit Landing Page button initiates an attempt to download a file. 2. Some link, button, or graphic on the landing page initiates a download when clicked, but does not clearly indicate that it will do so. For example, a big red button that says “Enter site” or “Check the weather,” but starts a download when clicked, deserves the flag. A similar button that says “Get It Now” or “Click here to download” does not.

Never install downloads that a site tries to initiate in this way: it is not part of the rating process. If you use this flag, some of the later questions will turn gray and don’t need to be answered.
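The Unexpected Download criteria can be sketched as follows. The `PageElement` type and its field names are hypothetical; each value records something the rater observed or judged, and nothing here implies that a download should be installed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageElement:
    """Hypothetical description of one link, button, or graphic on the page."""
    label: str                   # text shown on the element
    starts_download: bool        # clicking it initiated a download attempt
    clearly_says_download: bool  # rater's judgment: does the label warn of it?


def unexpected_download(download_on_visit: bool, elements: List[PageElement]) -> bool:
    """True if either criterion from the text is met."""
    # Criterion 1: the Visit Landing Page button itself triggers a download.
    if download_on_visit:
        return True
    # Criterion 2: some element downloads without clearly indicating it.
    return any(e.starts_download and not e.clearly_says_download for e in elements)


enter_site = PageElement("Enter site", starts_download=True, clearly_says_download=False)
get_it_now = PageElement("Click here to download", starts_download=True, clearly_says_download=True)
```

A page whose only downloading element is `get_it_now` would not get the flag; a page containing `enter_site` would.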

Error / Did Not Load Your job when evaluating a search ads task is to evaluate content provided by the advertiser (the ad creative and landing page). Use the Error/Did Not Load (EDNL) flag to indicate that you cannot evaluate the landing page because there is no landing site content provided by the advertiser. There are several reasons why you might not be able to access landing site content provided by the advertiser, including:
• the page or site no longer exists
• the page or site is under construction
• your browser is not able to find or access the page we provided you
• your virus/malware protection software blocks you from accessing the site
• the landing page opens using a 3rd-party program (such as iTunes) that you do not have installed
It’s not always easy to immediately determine if the EDNL flag should be used because different things can happen when a landing page is not working properly. Here are some examples of what you might see when no landing site content is available to evaluate:
• a completely blank page
• a generic Not Found message generated by your web browser (example: https://www.google-news.com/default.html)
• a generic error message generated by the advertiser’s server (example: http://www.centraldopolidor.com.br/enceradeiras.htm)
• a generic webpage (often filled with affiliate links) shown by the hosting service in place of the actual landing page (example: http://genealogywise.com/?reqp=1&reqr=)
• a search results page shown by your internet service provider because the actual landing page cannot be accessed (example: http://www.dnsrsearch.com/index_results.php?querybox=sdiwfkdis.com&submit=Search)
• a notice that the site or page is under construction with no way to access any other part of the landing site (example: http://www.reidknorr.com/demos/vinta_ss/)
In all these cases you should use the EDNL flag because you cannot access any content from the actual landing site to evaluate.
In the first two examples, above, there is little or no content to evaluate. In the last four examples, there may be content you can see, but it is either not content from the landing page advertiser (e.g. the hosting service, browser, ISP), or the entire advertiser site is inaccessible.

Note that a landing page could have an error on it but still have landing site content or a way to access landing site content on the page. Here are some examples of things you might see when there is an error on the page but advertiser content is still available to evaluate:
• a page which partially loads
• an error saying that the page could not be found, but linking to another part of the landing site
• an error stating that the product could not be found, but the page provides alternatives or a way to search the landing site for other products
• a blank page or an error page that still has site navigation tools (usually on the top or side)
• an error page which automatically redirects to and loads a working page on the landing page advertiser site
• a landing page which is blocked by a registration form
If an advertiser landing page provides enough content to rate, don’t use the EDNL flag. In the cases above, the flag is not used because there is at least some advertiser content on the LP on which to base your evaluation. If you use this flag, some of the later questions will turn gray and won’t need to be answered.
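The EDNL decision reduces to one question: is any advertiser content, or a route to it, available to evaluate? The table below restates the examples from the text as a lookup. The short labels and the function name are hypothetical, and the table covers only the listed examples.

```python
# True means advertiser content (or a way to reach it) is available to evaluate.
ADVERTISER_CONTENT_AVAILABLE = {
    "completely blank page": False,
    "browser Not Found message": False,
    "generic server error message": False,
    "hosting-service placeholder page": False,
    "ISP search results page": False,
    "under construction with no other access": False,
    "page partially loads": True,
    "not-found error linking to another part of the site": True,
    "product not found but alternatives or site search offered": True,
    "blank or error page with site navigation tools": True,
    "error page redirecting to a working advertiser page": True,
    "page blocked by a registration form": True,
}


def use_ednl(observation: str) -> bool:
    """Use the flag only when no advertiser content can be evaluated."""
    # Raises KeyError for observations outside the listed examples; those
    # cases need the rater's own judgment against the same question.
    return not ADVERTISER_CONTENT_AVAILABLE[observation]
```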

Secondary Interpretation of Query Use this flag when the landing page content indicates that the advertiser is targeting a clearly secondary interpretation of the query. An interpretation is secondary if it’s reasonable, but there is some other interpretation of the query that you consider much more likely. Don’t use this flag with interpretations that are wrong or unreasonable. Don’t use this flag if you think that the query has multiple, equally likely meanings, and the advertiser is targeting one of those meanings. Do use the flag where the query has multiple, equally likely meanings and the advertiser targets an obscure or less-likely meaning. Please review the main guidelines for instructions on how to approach the scale rating when you use this flag. If you have questions about this project that are not answered by the instructions above, please review the Rater Hub, which contains additional content about Search Ads rating. If you encounter a technical problem with this rating task, use the “Report a Problem” link in the lower right-hand corner of the rating page.

Reporting a Problem As you work on tasks in the Ad Rating System, you may encounter a problem with a specific task, problems with multiple tasks of the same type, system-wide problems, or other problems. In general, most of the issues you encounter within a task are expected, and the task instructions tell you how to handle them, while unexpected issues or larger issues should generally be reported to administrators. This section helps you determine the right way to handle a problem. When Not To Report A Problem How to Report A Problem with a Task How to Report a General Problem Common Problems Mobile Rating: Non-navigable pages Frequent Task Switching Queries for Illegal Material or Services Task Shortage Technical Issues that Prevent You from Submitting a Task

When Not to Report a Problem Never report a problem with a task if there is a way to address it within the task itself. Always read the task instructions first. They tell you what issues should be flagged within a task rather than reported to administrators. Examples of search ads tasks where you would submit the task rather than report a problem:
• You encounter a query in a foreign language. This task should be submitted. There is a way to handle this situation within the task itself: use the Foreign Language Query flag.
• You encounter a query for pornographic content. This task should be submitted. There is a way to handle this situation within the task itself: use the Porn Query flag.
• You encounter an ad for pornographic content along with a non-pornographic or ambiguous query. This task should be submitted. There is a way to handle this situation within the task itself: use the Unexpected Porn flag.
• A landing page fails to load. This task should be submitted. There is a flag or other option within the task to identify a situation like this: use the Error/Did Not Load flag instead of reporting a problem.

As a general rule, if there is a way to tell us about a problem within the rating task (by flagging, for example), don't use the "Report a Problem" link described below.

How to Report A Problem with a Task We provide an easy way to report a problem within the Ad Rating system itself. Before you report any problem, first review the task instructions, emails that we have sent you, and rater hub documentation. Carefully read the instructions for each task type as many of the problems you encounter are handled within the task itself, by choosing a flag and submitting the task. Please keep in mind that you encounter different task types within the Ad Rating System. Each task has specific instructions that tell you how to evaluate a task. We only want you to use the "Report a Problem" link where there is not a way to address the problem within the task itself. For example, if a task provides a flag to indicate a foreign language, use the flag and submit the task instead of reporting it. Most task types provide a way for you to flag unrateable task elements. In order to report a problem with an ad rating task, click the "Report a Problem" link at the bottom of the ad rating task window. The task will grey out and a Help panel will appear.

Here, you can search through the Rater Hub for more information before continuing on to report the problem. If you can’t find a solution to your problem, or if you’ve verified that the problem does need to be reported, click on the Email icon to report the task. The Help panel will expand. (The options provided here may not be identical to the ones you actually see.)

You will be prompted to provide more detailed information about the problem. A number of common problems are listed here: choose the option that best describes the problem you’re having with the task. Clicking some of these options will lead to further questions, ask you to type in a more detailed description of the problem, or give you instructions for resolving the problem on your own. This Help panel gives you the option to include a screenshot with your problem report. We recommend that you leave this box checked, as it may help us assess the problem more effectively. If you include a screenshot, you can use your cursor to highlight the area of the page relevant to the problem you’re reporting. Once you have provided all the necessary information and highlighted any relevant features on your screen, click SEND to submit your problem report. After submitting your report, you will be taken directly to your next task.

How to Report A General Problem If you need to contact the Ads Eval Administrative Team about a general EWOQ issue rather than a specific task, you can email us at [email protected]. If you email us, be sure to use a descriptive subject line, such as “Frequent Task Switching” or “Problem Accessing EWOQ”.

Common Problems For each of the situations described below, we provide insight to help you determine if reporting a problem is the appropriate course of action to take.

Mobile Rating: Non-navigable Pages It is possible for a mobile landing page to load but links from the landing page will not work. An example is if you can see the landing page, but clicking on any of the links simply refreshes the page, and it is impossible to navigate to other pages. If you encounter such a mobile LP, report it as a problem using the “Report a Problem” link on the rating page.

Frequent Task Switching If the system frequently switches you between task types or languages, and it affects your workflow, do not use the “Report a Problem” link on the task rating page. Instead, email us at [email protected] and use the subject line “Frequent Task Switching”.

Queries for Illegal Material or Services Please report any query that is an unambiguous search for something illegal using the “Report a Problem” link on the rating page. Such queries are rare. Administrators will review your submission.

Task Shortage New Raters If you just started your contract and began rating your training set, you will run out of tasks after you have submitted about 50 tasks. We automatically activate your account for regular tasks 1 to 4 hours after you run out of training tasks. We recommend that you take a break and try to acquire tasks after waiting a few hours. There is no need to contact us. We require that every task in the training set be submitted successfully before we allow new raters to move on to regular rating. This means that any skipped or dropped tasks are added back to your queue and must be acquired again. You may end up skipping a training task by navigating away from the task page, or by hitting the "Skip this Task" button rather than the "Submit" button. Sometimes this results in new raters seeing a "No tasks are available" message before they're actually done with the training tasks. Once you've gotten the “No tasks are available” message, you can check back every two hours to see if new work is available. You may have only a few tasks to do before you see the message again, but that’s ok. Once you are successfully through training, you will automatically be given access to new tasks.

Raters who have Completed Training We are constantly loading more work in the system, and we strive to have work available for acquisition at all times. We are not able to guarantee tasks will always be available for raters, and task shortages do happen sometimes. If you encounter a task shortage, we probably already know about it and are taking action to create more work. If you are no longer in training, please report task shortages by filling out the following form. Before reporting, we encourage you to take a break for a few hours and try again to acquire tasks. If you are still unable to acquire tasks, fill out the form. Do not report task shortages via email.
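The training-set rule (every task must be submitted, and skipped or dropped tasks come back) can be pictured with a short simulation. This is a sketch only: the function name is hypothetical, and putting a skipped task at the end of the queue is an assumption, since the text does not say where re-added tasks are placed or when they become available again.

```python
from collections import deque
from typing import Iterable, List, Set


def finish_training(task_ids: Iterable[str], skip_first_time: Set[str]) -> List[str]:
    """Return the order in which tasks are acquired until all are submitted."""
    queue = deque(task_ids)
    pending_skips = set(skip_first_time)  # tasks the rater skips on first sight
    acquired_order = []
    while queue:
        task = queue.popleft()
        acquired_order.append(task)
        if task in pending_skips:
            # Skipped or dropped: the task is added back and must be
            # acquired again (assumed here to go to the end of the queue).
            pending_skips.remove(task)
            queue.append(task)
        # Otherwise the task is submitted and leaves the queue for good.
    return acquired_order


order = finish_training(["t1", "t2", "t3"], skip_first_time={"t2"})
```

In this run the rater acquires four tasks to complete a three-task set, because `t2` has to be acquired a second time.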

Technical Issues that Prevent You from Submitting a Task If you experience a technical problem with a task that is unexpected and affects your ability to complete and submit a task, please report it as a problem using the “Report a Problem” link on the rating page. If the issue is recurring and prevents you from working, please mention this in your problem submission.
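The common problems above each map to one reporting channel, which can be summarized as a lookup. The enum, the dictionary keys, and the function name are hypothetical labels for the situations described in this section.

```python
from enum import Enum


class Channel(Enum):
    REPORT_LINK = "Report a Problem link on the rating page"
    EMAIL = "email to the Ads Eval administrative team"
    FORM = "task shortage form"
    WAIT = "wait and try again; no report needed"


ROUTES = {
    "mobile_page_links_do_not_work": Channel.REPORT_LINK,
    "frequent_task_switching": Channel.EMAIL,  # subject: "Frequent Task Switching"
    "unambiguous_illegal_query": Channel.REPORT_LINK,
    "task_shortage_in_training": Channel.WAIT,
    # Only after taking a break for a few hours and trying again; never by email.
    "task_shortage_after_training": Channel.FORM,
    "technical_issue_blocks_submission": Channel.REPORT_LINK,
}


def route(problem: str) -> Channel:
    """Look up the channel for one of the situations listed above."""
    return ROUTES[problem]
```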

Advanced Topics All ads quality raters begin evaluating the most common type of task: the basic search ad rating task. More experienced raters should expect to see mobile ads and ads with different formats. Raters hired for languages other than English should expect to see tasks in their primary languages as well as English. Finally, there are other task types available for evaluation. The following section discusses tasks beyond the basic search ad rating task. Task Switching Other Task Types LPQ-bucketing Testing and Feedback on New Task Types Mobile Ads Ad Formats Advanced Problem Solving

Task Switching The EWOQ Ad Rating System automatically schedules work for you based both on what is available and on what you are qualified to evaluate. Most raters are qualified to evaluate several task types. When the EWOQ system assigns a new task type to you, this is called Task Switching. The system tries to provide the same task types to raters in 30 minute blocks. You should expect to rate the same task type for about 30 minutes before you are switched to a new task type. If only one task type is available, however, you would continue to work on only that. For raters eligible to rate in English and another language, the system also tries to assign the same language in 30 minute blocks.
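The 30-minute block behavior described above might be pictured like this. It is a guess at the observable behavior only, not the scheduler EWOQ actually uses: the function name, its parameters, and the choice of the first other available type are all assumptions.

```python
from typing import List, Optional

BLOCK_MINUTES = 30  # block length stated in the text


def next_task_type(current: Optional[str], minutes_in_block: float,
                   available: List[str]) -> Optional[str]:
    """Pick a task type for the next task.

    `available` lists the task types that have work and that the rater is
    qualified to evaluate.
    """
    if not available:
        return None
    if current in available:
        only_choice = len(available) == 1
        # Stay on the same type inside the block, or when nothing else exists.
        if only_choice or minutes_in_block < BLOCK_MINUTES:
            return current
    # Block finished, or the current type ran out: move to a different type.
    for task_type in available:
        if task_type != current:
            return task_type
    return current
```

For example, ten minutes into a block the rater stays on the current type, after 35 minutes the sketch switches to another available type, and with a single available type it never switches. The same shape would apply to language blocks.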

Other Task Types A review of other task types you may encounter in the EWOQ Ad Rating System can be found on this page.

Mobile Ads Instructions for using your smartphone to evaluate mobile ads can be found on this page.

Ad Formats Ads come in a variety of formats and have different features. The most common ones you will encounter are described on this page.

Advanced Problem Solving The following section provides some options for solving problems you may encounter using the EWOQ Ad Rating System. If you use the “report a problem” link in EWOQ to report a problem, you may receive a message from administrators asking you to follow the instructions for one or more of the following problem solving tips.

Clearing Browser Cache Your browser saves--or caches--web content such as html pages, images, cookies, etc. on your computer to improve performance. If this content gets corrupted, clearing your browser cache can fix some problems. This is the most common solution for a Visit Landing Page button that opens the advertiser landing page in the Ad Rating Task window instead of in a separate browser window or tab. Chrome: From Menu, choose Preferences. In the window, click the Show advanced settings link. Under the Privacy heading is a Clear Browsing Data... button. Click on this button, ensure that the "Cached images and files" checkbox is checked, and click the Clear browsing data button. Firefox: From Menu, choose Preferences. In the pop-up Preferences box, select the Advanced icon. Under "Cached Web Content" is a Clear Now button. After clearing your browser cache, we recommend logging out of all Google services, quitting all browser instances (completely closing your browser), logging back into the EWOQ system, going to the EWOQ home page, and trying to acquire a new task.

Finding and Updating Your Browser Default Language If you evaluate tasks in a language other than English, you might be asked to check your browser default language. Chrome: From Menu, choose Preferences. In the window, click the Show advanced settings link. Under the Languages heading is a Language and input settings button. Click on this button and ensure that the language you are rating in is the top language on the list. If that language is not listed, click the Add button, add your rating language, and click the Done button. Firefox: From Menu, choose Preferences. In the pop-up Preferences box, select the Content icon. Under "Languages" is a Choose... button. Default languages are listed in order of preference. Choose the Select a language to add… button and add your rating language if it is not present. Click OK to close the Languages selection box. Click the Close button to close the Firefox Preferences box. Raters will rarely need to modify their default browser language.

Sending Administrators Your Browser and Browser Version Number If you report a problem using the “report a problem” link within the EWOQ system, your browser type and version number are recorded to help us troubleshoot the problem you’re reporting. If you email us about a problem, we may ask you to provide this information to help us resolve the issue.

Locating Browser Console Information and Sending it to Administrators If you have reported an unusual problem with a task to administrators, you may be asked to send browser console information. Administrators generally ask for this information when your browser Chrome: Use the keyboard command, + + , to open up the console. Drag to select everything in the console (or as much as possible). Copy and paste it into an email, and send it to the administrator. Firefox: Use the keyboard command, + + , to open up the console. Drag to select everything in the console (or as much as possible). Copy and paste it into an email, and send it to the administrator.

Rating Mobile Ads Installing and Using a QR Code Reader on Your Smartphone to Rate Mobile Ads Evaluation Tasks Updated December 18, 2014

If you currently have a smartphone, you are eligible to evaluate mobile content by opening landing pages directly on your smartphone. To do so, install a QR code reader app on your phone, and use it to scan codes in EWOQ to open landing pages on your device.

Opting In As you work in EWOQ, you will receive a survey that will allow you to opt in to using QR codes for mobile tasks. If you choose to participate, you will also have to complete a test to confirm that you are able to successfully scan QR codes. If you haven't already opted in to using QR codes and would like to, be on the lookout for future surveys that will allow you to opt in.

QR Codes A QR code is a type of barcode that can be read by an imaging device, such as a camera. Many QR codes contain URLs, and a smartphone can be used as a QR code scanner and direct the phone’s browser to the URL specified in the code.

Where to Get a QR Code Reader For mobile tasks, any QR code reader app will do. You can download one (many for free) from the Google Play Store or iTunes. Here are some examples: For Android: QR Code Reader Barcode Scanner For iOS: QR Code Reader by Scan RedLaser – Barcode Scanner, Shopping Assistant & QR Code Reader You do not have to use one of the apps suggested above. You may pick an app you like or use one you already have installed on your phone.

Ad Formats Ad raters will encounter several different types of ads, in both desktop and mobile formats. The following is a list of the most common types of ads. In some cases, ads in the Ads Rating System may appear different than they do in public. Depending on the ad and how it is displayed, we may want you to report it as a problem, or we may ask you to ignore the issue. The section below tells you what to expect and what to do with common ad formats. When rating ads, always read the instructions for the specific type of task you are rating. For example, when evaluating the most common type of ad in the system, Search Ads, you are not expected to click on any part of the ad creative, whether the title or sitelinks (see below). With other task types, you may be asked to do so: the instructions always explain the expectations for a particular task type. Desktop Ads Text Ads Sitelinks Product Listing Ads (PLA) Ads with Specified Locations Seller Rating Ads Combinations of Different Ad Types Mobile Ads Sitelinks Sitelinks with Seller Rating General Guidance for Rating Ads with New Features

Desktop Ads Text Ad The most basic type of ad consists of a title, URL and subtext. The title appears first, in blue font. The URL appears second in green, and the subtext is below the URL in black. An example of this type of ad is shown here.

Sitelinks Another common type of ad has clickable links to other parts of the advertiser's site. These clickable links are known as "sitelinks" and appear in blue below the ad text. Users can click either the ad title or the sitelinks. However, when you are doing regular search ads rating and are asked to rate the Ad Creative, you should not click on any part of the ad, including the sitelinks. Consider the sitelinks only in terms of whether you think they would add to the value of the Ad Creative.

Product Listing Ad (PLA) Another type of ad that you will see is a Product Listing Ad (PLA). These ads will show a name, picture and price of a product along with the name of the seller or store. The name of the product will appear in blue, while the price should appear below the name of the product, and the name of the store appears in green below. Normally, these ads appear with the picture above all of the text, as shown here. The query here is [kids detective costume] and a set of PLA ads is shown.

However, when ad rating, these ads will usually appear with the picture to the right, as seen below.

When rating PLA ads, you should focus mostly on the text, but you can also take the picture into account if you feel it adds substantially to, or takes away from, an ad. Rate the ad creative taking the full text into account, including the price. While the examples, above, do not look exactly like PLA ads on Google Search Results pages, please treat them as if they are, and do not lower your rating because of these differences. Sometimes the image appears to the left of the text in a PLA. Other times, there could be a technical problem and the image might not appear fully rendered, as shown below.

If you do not feel you can determine how an image would affect the quality of an ad because parts of the image are blank, hidden, or missing, you should report the ad using the “Report a Problem” link at the bottom of the rating task. However, if only a small part of the image is blank, hidden, or missing, and you feel you can adequately evaluate the image and how it affects the quality of the ad overall, then do not report it. Instead, rate the ad, and do not give a lower Ad Creative rating due to the display issues of the image.

Ads with Specified Locations These types of ads are similar to normal ads but have the advertiser's physical address and phone number below the main text of the ad.

Seller Rating Ads Seller rating ads are another important ad type that you will encounter. These ads have this name because they include a star rating for the advertiser. The star ratings generally appear below the URL and above the main text as shown here. One thing to keep in mind: when performing search ads rating, don't penalize the ad in the Creative to Landing Page rating when you do not encounter actual reviews on the LP.

Combinations of Different Ad Types Sometimes you will notice ads that have multiple features. Here are examples of seller rating ads with sitelinks. Here the ratings appear below the URL (as for most seller rating ads), and sitelinks appear below the main ad text as they do for other sitelinks ads.

Sometimes the star ratings will appear to the right of the URL on the same line. This ad also has sitelinks, however they appear on two lines, instead of just one.

Mobile Ads

Most of the ads described also appear on the mobile platform. One of the general differences between mobile and desktop ads is that mobile ads appear with a slight background highlight when you encounter them.

Sitelinks Mobile ads can also have sitelinks, as seen below.

Sitelinks with Seller Rating Mobile ads can also have multiple features, like these seller rating and sitelinks ads.

General Guidance for Rating Ads with New Features You may encounter ads with other features not described here. The Ads with Additional Features section of the Search Ads Instructions provides general guidance for approaching ads with additional features. You may also contact the ads evaluation administrative team for guidance.

Other Task Types Experienced raters should expect to encounter multiple task types as they work in the EWOQ Ad Rating System. Some task types are fairly common and consistent, and you will probably see them multiple times in any given week or month. You may also encounter new, experimental, or unusual task types. Regardless of what enters your rating queue, we always provide detailed instructions about what is expected, and how to evaluate a given task.

Example: Landing Page (bucketing) Here is an example of a common task you may encounter: Landing Page (bucketing). This task type involves examining an advertiser mobile landing page and site. It does not include a user query or an ad creative. Instead of evaluating the level of satisfaction a landing page may or may not provide based on a specific user query, you instead evaluate the advertiser landing page to determine which tag is most applicable based on the provided instructions.

Whenever you first encounter a new task type, the full instructions are presented to you. You must always fully review the guidelines for any new task type so you know how to answer the provided questions correctly. In the example above, you will notice the Landing Page URL asks you to visit a mobile landing page. This is done within the regular EWOQ system using the regular web browser of a desktop, laptop, or netbook computer. No mobile device is used. To configure your computer to view mobile advertiser landing pages, visit the rating mobile ads page.

Testing and Feedback on New Task Types When we launch a new task type, we may ask you to help us test it. We generally send raters an email alerting them of the new task type. When we send you a new task type, you may encounter problems, unexpected task behaviors, or other inconsistencies. The feedback you provide via the “report a problem” link helps us improve tasks like this. We may also ask you to fill out a survey about the new task types you evaluate.

General Information The General Information section contains information on Work Requirements. You will also find System Information: useful information about browser and device compatibility as well as general system information. There is a glossary of commonly used terms and a Frequently Asked Questions (FAQ) page. Finally, there is a Contact Ads Eval page that briefly outlines ways of contacting the Ads Eval administrative team. Work Requirements System Information Glossary FAQ Contact Ads Eval

Work Requirements Quality and Productivity Standards Continued participation in the Ads Rating Program is contingent upon meeting minimum quality, participation, and productivity standards. The following section explains our expectations in each of these areas. We monitor your work and provide you with feedback. It is your responsibility to meet or exceed minimum quality, productivity, and participation standards.

Accuracy and Quality The accuracy and quality of the work you submit is very important to us. In order to produce high-quality work, you should carefully read the rating instructions, carefully apply those instructions, and review them again as necessary to ensure your answers accurately reflect how the rating instructions tell raters to rate.

Productivity While you are actively working, we expect you to acquire and evaluate new Ads Eval tasks, review existing tasks, and review documentation as needed to perform your duties as an Ads Quality Rater. Productivity is a measure of how efficiently you do these things. Most of your work involves working on new tasks. Because we expect raters to be productive, we review the work you acquire, evaluate, and submit. Because reading documentation is important, we also consider this activity when evaluating your overall work. We do not want you to submit tasks as rapidly as possible: as discussed above, the accuracy and quality of your work are very important. You must balance these key areas.

Productivity expectations We consider the type of work you do, and the introduction of new types of work, when we evaluate your productivity. New raters work primarily on search ad rating tasks, the most common task type. Experienced raters work on other task types as well. When we add new task types to your queue, we factor in time to read the instructions and familiarize yourself with the new work. Keep in mind that instructions for new task types generally build on the knowledge you have already gained when learning the regular search ad instructions, so mastering new task types should take less time.

Tips for Balancing Quality/Accuracy and Productivity:
1. When you work on Ads Evaluation, devote all of your attention to your work. Concentrating on your work will naturally make you more productive. We expect you to devote your Ads Eval work time solely to Ads Eval. When you focus your attention solely on your work duties, your ratings are likely to be more accurate, which results in fewer mistakes due to oversight.
2. Find the right balance of quality and productivity. Remember, Ads Quality Evaluation is a balance of two key elements: QUALITY and PRODUCTIVITY. Finding the right balance between high-quality, productive, and engaged work is the key to being a successful rater. If you rate too fast, your quality may suffer. If you focus exclusively on quality, you may not complete enough tasks to be productive. You must balance the two. Everyone is capable of working efficiently, providing high-quality work, and being an engaged rater.

Confidentiality of Work and Documentation The work you do for the Ads Evaluation Program is confidential. As a condition of your employment, you are not permitted to show the EWOQ Ads Evaluation System, work tasks, or instructions to others. Do not work in areas where others might view your work (e.g. a cafe, public library, or hotel lobby).

System Information The EWOQ Ad Rating System is designed to work with desktop, laptop, or netbook computers running a variety of operating systems (e.g. Windows, MacOS, Linux, ChromeOS). The EWOQ system works best with the Google Chrome and Mozilla Firefox browsers. Internet Explorer is not supported. Do not use tablets or mobile phones to access the Ad Rating System. You must use a desktop or laptop computer. You are required to have a high-speed internet connection in order to perform your rating duties.

Browsers The EWOQ system supports Google Chrome and Mozilla Firefox browsers. Internet Explorer is not supported. If you do not have Chrome or Firefox installed on your computer, either application can be downloaded for free using the following links: Chrome: https://www.google.com/intl/en/chrome/browser/ Firefox: http://www.mozilla.org/en-US/firefox/new/

Mobile Devices Mobile devices, such as phones and tablets, may not be used to submit ads evaluation tasks at this time. We ask that you do not use them to access the EWOQ Ad Rating System. However, we do ask raters to evaluate mobile ads. This must be done from a desktop, laptop, or netbook computer, but you will use a smartphone to scan QR codes in EWOQ in your computer’s browser to open landing pages in your smartphone’s mobile browser. Instructions for using your smartphone to evaluate mobile ads can be found on the Rater Hub’s Rating Mobile Ads page.

Signing Into and Out of the EWOQ System In order to work in the ad rating system, you are required to sign into your Google account using the email address you provided when you started your contract. When you are ready to begin work, log into your Google account, go to the EWOQ Ad Rating Home page, and begin work. Please sign out of the EWOQ Ad Rating System when you are not actively working. Sign out of the EWOQ system by clicking on the Sign Out link in the upper right-hand corner of the EWOQ system. We recommend that you sign out for three reasons: • by signing into the system, you indicate to us that you intend to actively work, and by signing out you indicate that you are no longer working • signing out prevents unauthorized individuals from accessing the confidential EWOQ system using your computer • when you sign out, the EWOQ system releases any work you have not completed and returns that work to the rating queue for other team members to complete.

Task Expired Due to Inactivity If you have an unrated or partially rated task and do not sign out of the EWOQ system, that task can expire if you don’t submit the task in a reasonable amount of time. When a task expires, you lose all work on that task, and the system displays a message in your browser window which states, “This task has expired due to inactivity.” If you see this message, it means the task you were working on was released back to the rating queue for another rater to work on. This message does not affect any work you might have submitted. If you encounter this message frequently and are signing out of the EWOQ system when you finish working, please contact the Ads Eval Administrative Team.

Advertiser Downloads We don’t expect you to download anything from the advertisers you evaluate. In general, your interaction with advertisers is limited to viewing ads and web pages. We specifically ask that you not install anything offered by advertisers. Installation of advertiser apps, toolbars, extensions, add-ons, executables, etc. is not part of your job as an ads quality rater.

Ad Block Software In order to see the ads you evaluate, you must disable any adblock software you may have installed while you are working on this project. You are welcome to enable such software when you are not working on this project.

Antivirus Software, Viruses, and Malware

The likelihood that your computer will get a virus while working on this project is extremely low. Any viruses and malware found on advertiser sites have usually been taken care of and removed from our system before you ever acquire an evaluation task. Nonetheless, as a precaution, we strongly recommend you have robust, up-to-date antivirus software installed. There are several commercial and free antivirus software packages available. While we cannot recommend a particular package, we do recommend you install a program. You can substantially reduce the possibility of your computer being infected by a virus, malware, adware, or other badware by following safe, commonsense browsing practices. And remember that we do not expect you to install anything from the advertisers you are evaluating. Your primary responsibility is to evaluate ads and advertiser landing pages.

EWOQ Glossary This glossary contains a comprehensive list of terms from EWOQ and its accompanying documentation.

A ad See ad creative (AC).

administrator (admin) A Google representative who interacts with AQRs by email.

ad creative (AC) The ad creative is an advertisement that appears under the ads heading on the Google search results page. Traditional ad creatives have a title, a body, and a visible URL; however, new ad creatives have additional elements like maps, videos, or extra links. When a user clicks on an ad creative, it takes them to the advertiser's landing page.

Ad Rating System Also known as the EWOQ Ad Rating System. See EWOQ.

[email protected] The email address from which all communication with Google administrators will come.

ads quality rater (AQR) Ads quality raters are responsible for reporting and tracking the visual quality and content accuracy of Google advertisements. AQRs use EWOQ to examine advertising-related data of different kinds and provide feedback and analysis to Google. Projects that ads quality raters work on involve examining and analyzing text, web pages, images, and other kinds of information.

B browser A browser is an application on a computer or mobile device that is used to view webpages. While some browser applications are preinstalled on a computer or mobile device, others must be manually installed by the user. Some browsers are cross-platform, meaning they work on multiple devices. For example, the Google Chrome browser is available on Macintosh, Windows, and Linux computers as well as tablets (iPads and Android) and mobile phones (iPhones and Android). The most common browsers in use today are Chrome, Firefox, Internet Explorer, and Safari.

C contract administrator A company that hires AQRs and is responsible for handling their timesheets, pay, time off requests, and other contract-related issues.

creative See ad creative (AC).

D E EDNL Abbreviation for Error Did Not Load.

EWOQ EWOQ is the ad rating system that ads quality raters (AQRs) use to acquire and complete tasks.

F flag A checkbox on a task page that can be checked to indicate that the content being evaluated meets a specific set of criteria, for example, that it is in a foreign language (FL). Each flag has particular guidelines for its use that can be found in the task instructions of the task for which the flag is used.

foreign language (FL) Any language other than a task’s rating language.

G H I

instructions See task instructions.

J K L landing page (LP) When a user clicks on an ad creative, we send the user to the advertiser landing page. This is the first page of the advertiser site that the user sees.

landing site (LS) The website of which the landing page is part.

M N New EWOQ The current ads evaluation system. Ads quality raters (AQRs) acquire tasks and rate them based on the included instructions. As AQRs work in the New EWOQ interface, they acquire tasks as they need them and submit their ratings as they complete them. AQRs perform most or all of their work in New EWOQ.

O operating system An operating system is a fundamental piece of software that manages hardware resources and provides common services for computer programs. Operating systems are an essential part of desktop, laptop, and netbook computers as well as tablets and mobile phones. Operating systems have version names or numbers (or both). The two most common operating systems in use on desktop and laptop computers today are Microsoft Windows and Apple Macintosh OS X. While devices may use older versions of an operating system, the current version of Windows is 8.1, and the current version of Apple OS X, also called Mavericks, is 10.9. For tablets and mobile phones, the two most common operating systems are Google Android and Apple iOS. The current version of Google Android, called KitKat, is 4.4. The current version of Apple iOS is iOS 7. The EWOQ ads evaluation system has minimum operating system requirements, which are outlined on the System Requirements page of the Rater Hub.

P Q QR code A QR code is a type of barcode that can be read by an imaging device, such as a camera. Many QR codes contain URLs, and a smartphone camera can be used as a QR code scanner and direct the phone’s browser to the URL specified in the code.

query A query is the set of words, numbers, symbols, or all three that a user types in the search box of a search engine. A user enters a query into the search box of a search engine in order to find something, learn something, or go to a web site or web page.

R radio button A user interface button that allows you to choose only one of a predefined set of options.

rater See ads quality rater (AQR).

rating country Every task has a rating country associated with it. A task should be evaluated from the perspective of a user within the rating country of that task. The rating country and rating language are displayed at the top of a task next to the title. For example, if the rating country for a task is Japan, the task is evaluated from the perspective of a resident of Japan.

rating language Every task has a rating language associated with it. A task should be evaluated from the perspective of a user using the rating language of that task. The rating language and rating country are displayed at the top of a task next to the title. For example, if the rating language for a task is Japanese, the task is evaluated from the perspective of a Japanese speaker.

S

secondary interpretation Where an advertiser assumes a meaning that is possible but not very likely, this is a secondary interpretation.

slider Sliders are used on many EWOQ tasks as a way for raters to indicate numeric ratings on a continuous scale, usually from -100 to 100. Tasks that use a slider allow raters to enter a rating at any point between these two extremes by moving the marker along the scale. See the FAQ article on the Rater Hub for more information about rating with sliders (“How do I rate using a slider?” and “What if I never seem to use a certain part of the slider?”).

T task The unit of work for an ads quality rater (AQR).

task instructions The task instructions are the instructions for a task. They are presented before each new task type and at the top of each task.

task switching The EWOQ ads evaluation system automatically schedules work for raters based on what is available and what a rater is qualified to evaluate. Task switching refers to the EWOQ system switching you to a different type of task.

U unrateable An unrateable task element has one or more of the following problems, which means that it, and any other task items that relate to it, are not eligible to be assigned ratings: • it is in a language other than the task language (FL) • it is complete nonsense; research reveals no plausible meaning • it is unambiguously pornographic or about sexual services • it is unambiguously about illegal materials or services • it is unreadable; for example, due to a rendering or transcription error • it is unavailable for users in your location and will not load • it is blank or otherwise offers insufficient content to rate

URL The URL is a unique Web address, such as http://www.ford.com. In ads quality evaluation, you will see a URL at the bottom of the ad creative. Each landing page has a unique URL. Occasionally, you may encounter a user query that is a URL like www.ford.com.

user intent (UI) A user's intent is what the user is really seeking. Some queries perfectly represent what the user intent is while others are more difficult to decipher. Your first responsibility when evaluating a task is to do your best to determine the user intent behind the query.

V visible URL The visible URL is the web address that appears in an ad creative. It is usually at the bottom of the ad creative, but its location may vary depending on the format of the ad. The landing page URL is usually different from the visible URL. For this reason, a Visit Landing Page button is provided to get users to the landing page to be evaluated.

W X Y Z

Frequently Asked Questions (FAQ) The following section contains answers to frequently asked questions. This page has been divided into three sub-sections: please review the relevant sub-section for answers to your questions. If you can't find an answer here, elsewhere in the user guide, or in the task instructions, please see the Contact Ads Eval page for information on how to contact us. Search Ads FAQ User Query FAQ Other/General FAQ

Search Ads FAQ How do I rate a landing page (LP) that fails to load, is completely blank or otherwise has no content from the advertiser? If the landing page fails to load, shows a server error, is completely blank, or otherwise has no content (title, links, images, other sections) pertaining to the advertiser, and you cannot navigate to another page in the advertiser's site, use the Error/Did Not Load (E/DNL) flag. What if I see an error message (such as a 404 error) but there is a title on the page or other advertiser-provided content, or I can navigate to another page within the advertiser's site? In this case, do not use the E/DNL flag. Rate the page with the user in mind. If you think an error message on a landing page (even if you can get to another page within the site) would constitute a negative user experience, rate negatively. The landing page is in a foreign language, but there is a link (such as a flag icon) that takes the user to a page in the correct rating language. Should this landing page get the Foreign Language (FL) flag? No. If you can find a flag icon, or some other link that takes you to a page in the correct rating language, the FL flag is not used. Rate the LP, taking into account the user experience of initially seeing a page in the wrong language and having to navigate from the LP to another page on the advertiser site. How do I rate an ad creative that is cropped or looks unfinished, or is otherwise displaying strangely? In general, you always rate what you see. See the Ad Formats section of the Rater Hub for different ad format examples and how to evaluate them. In some cases, ad creatives display differently in the ad rating system than they do publicly.
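The E/DNL and FL answers above can be read as a small decision rule. The following Python sketch is only an illustration of that rule, not part of the EWOQ system: the `LandingPageObservation` class, its field names, and the `applicable_flags` function are hypothetical names introduced here. It assumes that the FL flag applies when a landing page is in a foreign language and offers no link to a page in the rating language. Always defer to the task instructions.

```python
from dataclasses import dataclass


@dataclass
class LandingPageObservation:
    """What a rater observed after clicking Visit Landing Page (hypothetical structure)."""
    has_advertiser_content: bool    # title, links, images, or other advertiser sections
    can_navigate_within_site: bool  # another page of the advertiser's site is reachable
    in_foreign_language: bool       # page is not in the task's rating language
    links_to_rating_language: bool  # e.g. a flag icon leading to the rating language


def applicable_flags(page: LandingPageObservation) -> list[str]:
    """Return the flags suggested by the FAQ answers; an empty list means rate the page."""
    flags = []
    # E/DNL only when there is no advertiser content AND no way to reach another page.
    if not page.has_advertiser_content and not page.can_navigate_within_site:
        flags.append("E/DNL")
    # FL is not used when a link leads to a page in the correct rating language.
    if page.in_foreign_language and not page.links_to_rating_language:
        flags.append("FL")
    return flags


# A 404 page that still shows an advertiser title: no flag, rate with the user in mind.
print(applicable_flags(LandingPageObservation(True, False, False, False)))
```

An empty result does not mean a positive rating. An error message or a wrong-language first page should still be reflected in the rating itself.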

What if a page doesn't load when I click on the Visit Landing Page button, but I am able to view it by copying and pasting the URL in a new window? You should only evaluate pages that you see after clicking the Visit Landing Page button from within the rating task. Never do any of the following: • Copy and paste the visible URL from the ad creative on the Ad Rating Task page. • Manually change the URL in the window of the landing page. • View pages using a proxy (anonymizer) website. A proxy website is a site that allows users to access websites anonymously. Students and employees sometimes use proxy websites to access websites that are banned from their school or work networks. Copying and pasting the advertiser's URL usually results in you evaluating the wrong landing page and choosing the wrong ratings. What should I do if the landing page requires registration to view its content? Rate the LP based on the information you can gather about it without registering for the site. You should take into account the fact that the page content is not accessible without registration when determining your rating. For regular search ad rating tasks, you should never provide personal information to or register for a service provided by an advertiser. How do I rate using a slider? Many of our rating tasks involve using a sliding scale to indicate a rating between -100 and 100. You can enter a rating at any point between these two extremes by moving the marker along the scale. In our rating instructions, we often refer to certain ranges on this scale as a shorthand to pick out certain numerical values. For example, when talking about the Landing Page rating for Search Ads tasks, we use Dissatisfaction Likely to refer to the -100 to -50 range on the scale. Likewise, Dissatisfaction Possible refers to -50 to 0; Satisfaction Possible, 0 to 50; and Satisfaction Likely, 50 to 100. However, it’s important that you remember these are ranges, not just categories.
We actually want you to rate using the entire scale. If you’re familiar with the American grading system, think of it like assigning a number grade rather than a letter grade-- a 70% and a 79% are both C grades, but the numbers tell you more than the letter grade alone. Not every ad that falls in the same category deserves the exact same score, so you should select a numeric score that reflects where an ad falls within its category; you should not use the same number for every ad in the same category, nor should you randomly select any score in the range. What if I never seem to use a certain part of the slider? We want you to rate using the entire length of the scale. Very bad ads should get ratings at the extreme low end of the scale (ex: -100, -76, -80) and very good ads should get ratings at the extreme high end (ex: 87, 100, 79). If you find that you never use a certain part of the scale-- for example, the -100 to -75 range or the 75 to 100 range-- take a minute to stop and review the relevant part of the task instructions. You probably need to adjust the way you’re assigning ratings. Raters see a lot more ads than the average user. This means that experienced raters can become desensitized to certain kinds of ads-- very good ads and very bad ads have less impact on you over time. This might lead you to use extreme ratings less often than you should. Similarly, as you become familiar with different advertisers and kinds of websites, you might start relying on your past experiences when choosing a rating, rather than the task guidelines. It is important to stay aware of these tendencies and correct for them. Make sure when you’re rating that you can focus and evaluate each ad creative or landing page on the full -100 to 100 scale, using your full knowledge of the guidelines. Assess each ad and landing page from the point of view of a real user with a specific goal. Again, very bad ads should get ratings at the extreme low end of the scale and very good ads should get ratings at the extreme high end. It’s important to be intentional about how you use the middle of the scale, too, so that each rating you give is meaningful.
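As an illustration of how the named slider ranges relate to numeric scores, here is a minimal Python sketch. It is not part of EWOQ; the function name `slider_range` is hypothetical. The guide lists ranges that share their endpoints (-50, 0, and 50), so this sketch assumes that a boundary value belongs to the higher range. That boundary handling is an assumption, not something the instructions state.

```python
def slider_range(score: float) -> str:
    """Name the Landing Page rating range that contains a slider score.

    Assumption: the shared endpoints -50, 0, and 50 are assigned to the higher range.
    """
    if not -100 <= score <= 100:
        raise ValueError("slider scores run from -100 to 100")
    if score < -50:
        return "Dissatisfaction Likely"
    if score < 0:
        return "Dissatisfaction Possible"
    if score < 50:
        return "Satisfaction Possible"
    return "Satisfaction Likely"


# Two scores in the same range still carry different information.
for score in (-100, -76, 25, 79, 87):
    print(score, slider_range(score))
```

The range name is only a label. A score of 79 and a score of 87 both fall in Satisfaction Likely, yet they express different judgments, which is why the numeric score matters.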

User Query FAQ How do I rate a task that has a query in a foreign language? First, research the query. You may find that the query term is commonly used in your rating language, or that it refers to a specific business, person, or event. If you determine the query is in a foreign language, you should flag it as unrateable and submit the task. For search ads tasks, use the Foreign Language Query flag. This flag is only present on the first page of a Search Ads task, the Ad Creative rating section. What should count as a Foreign Language query? A query should be legible in your rating language: if the query contains words or phrases in another language, but there is enough content in the task’s language that it is understandable, it is not a Foreign Language query. You must research the query to determine if it should be considered a foreign language query. A query that first seems to be in a foreign language may be a well-established term in the culture of your rating language, or it may be a proper noun. Some examples include business, technology, or economic terms; specific products, events, or businesses; and people, places, or artists. Consider the following query a rater working in English might encounter: [ heroes del silencio ]

At first glance, you might be tempted to treat this as a Foreign Language query. Query research indicates it is a query for a music group, and as such is not considered FL because that is the official name of this particular music group. Other examples might be English terms like "email" or "business" for non-English rating tasks, as these can be well-understood and established terms in other languages and cultures. Always assume that the user entering the query is very familiar with the language and culture of the rating language and location. What should I do if a query doesn't make sense? Don’t assume that a query is nonsensical just because you do not immediately know what it means. Encountering a completely nonsensical query is rare. Most queries mean something, so you should always research the query, even if at first it doesn’t make sense to you. As you research, take into consideration that queries that may look like nonsense might actually turn out to be meaningful. The following are examples of queries that do have meaning and should not receive the Nonsense Query flag:
• a misspelling
• a product code or model number
• technical specifications
• a partial web address or YouTube video ID
• a specific username or Twitter handle
• an uncommon acronym or abbreviation

For example, research reveals that [ 433 a oic ] is a specific IRS tax form number, and [ dMH0bHeiRNg ] is the video ID for a popular YouTube video. That’s why it’s important to investigate the results a query returns, especially if it looks confusing or meaningless at first. When there is no way for you to reasonably guess about user intent, even after researching the query, you should consider that query unrateable. For search ads tasks, use the Nonsense Query flag. This flag is only present on the first page of a Search Ads task, the Ad Creative rating section. What should I do if a query is unambiguously pornographic or looking for sexual services? If you determine the query is unambiguously pornographic or looking for sexual services, you should flag it as unrateable and submit the task. For search ads tasks, use the Porn Query flag. This flag is only present on the first page of a Search Ads task, the Ad Creative rating section. Note that if you are working on a task where porn content is possible or expected, this will be mentioned in the task instructions. Don't flag porn as unrateable if it is expected for a particular task.

Other FAQ Can I use my iPad to do my tasks? No. The Ads Evaluation system does not support rating on any kind of tablet, including iPad. What do I do if I am being frequently switched between task types or languages? If you are being switched very frequently, report this as a problem. In general, our system tries to provide you with the same type of task for a minimum of 20-30 minutes. Why can't I submit my task in the Sitelinks task type? Make sure you have visited each tab (pertaining to each sitelink) and answered all the questions for each tab. As with all tasks, you must answer all questions before the submit button becomes active. I am not familiar with the rating country specified at the top of the task. How should I rate the task? You should always rate from the perspective of a user in the specified rating country. If you are not personally familiar with the country, you should do additional research on the task to make sure that you understand it. I am rating Spanish Ads and I saw a query in a language spoken in Spain that is not Castilian Spanish (e.g. Catalan). How should I rate this task? You should have been sent the email: "SPANISH ADS EVAL: How to Approach Multiple Languages in Spain" when you were first added to the Spanish team. It explains how to deal with other languages spoken in Spain when working on Spanish Ads Eval tasks. What is a Russian Transcription Error? This is a classification that is only relevant for raters working in Russian. If you do not work in Russian, please disregard any options relating to Russian transcription errors. If you are rating tasks in Russian, sometimes you will see queries that look like nonsense but could make sense if you were to enter the query on a Cyrillic keyboard instead of a Latin keyboard. For example, the query [ форд ] would be [ ajhl ] if entered on a Latin keyboard, and [ уву загорать в солярии ] would be [ ede pfujhfnm d cjkzhbb ]. We call these transcription errors. 
You should have been sent the "RUSSIAN ADS EVAL: How to Handle Transcription Errors and Incomprehensible Queries" email when you were first added to the Russian team. It offers more details on transcription errors.
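To see how the example queries above relate to each other, the following Python sketch maps Latin keystrokes to the Cyrillic letters found on the same keys. It assumes the standard Russian ЙЦУКЕН layout over a US QWERTY keyboard and handles lowercase keys only. The function name `latin_keys_to_cyrillic` is hypothetical, and this is a research aid for understanding a query, not a tool provided by EWOQ.

```python
# Keys of a US QWERTY keyboard, row by row, and the letters on the same keys
# in the standard Russian layout (assumed here; lowercase only).
LATIN_KEYS = "qwertyuiop[]" "asdfghjkl;'" "zxcvbnm,."
CYRILLIC_KEYS = "йцукенгшщзхъ" "фывапролджэ" "ячсмитьбю"

# str.maketrans requires both strings to have the same length.
KEY_TABLE = str.maketrans(LATIN_KEYS, CYRILLIC_KEYS)


def latin_keys_to_cyrillic(query: str) -> str:
    """Show what a Latin-keyboard query would read as if typed on a Cyrillic layout."""
    return query.lower().translate(KEY_TABLE)


print(latin_keys_to_cyrillic("ajhl"))
print(latin_keys_to_cyrillic("ede pfujhfnm d cjkzhbb"))
```

Characters without an entry in the table, such as spaces and digits, pass through unchanged.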

I am rating Russian Ads and I saw a query that had a transcription error. How should I rate this task? Queries that have major transcription errors should be considered unrateable. If you encounter one while working on a Search Ads task, you should flag it using the Russian Transcription Error flag.

My smartphone's browser loads a mobile landing page, but I cannot navigate to different parts of the site. What should I do? Please report this problem to admins by clicking the "Report a Problem" link. Click on the email icon at the bottom of the Help panel, then choose the “Other technical problem” option. State the issue you're experiencing in the comment box, then click Send.

What should I do if I encounter a 403 Forbidden Error message when trying to access the Ad Rating System? Encountering a 403/Forbidden error or a blank page when trying to access the ads eval system is nearly always caused by logging into multiple Google accounts simultaneously. The ads eval system does not support being logged into any Google account other than the Gmail address you use to rate tasks. You would encounter a 403 error even if you had logged into another Google account but closed the window or tab. Please follow these steps, which ensure that you are logged out of all other Google accounts:
1. Click on this link to log out of all Google accounts: https://accounts.google.com/Logout
2. Click on this link to log into the ads evaluation system: https://www.google.com/evaluation/ads/beta/rating/gwt/index.html#raterhub/subpage=userguide
3. You will be prompted to log in. Please enter the Gmail address you were provided to access the ads rating system along with your password.
4. You should be redirected to the ads evaluation system's Rater Hub page.

If this doesn't work, we find that most raters can fix this problem by clearing their browser cache. Here's how:
1. Clear your browser's cache.
• Chrome: From Menu, choose Preferences. In the window, click the "Show advanced settings" link. Under the Privacy heading is a Clear Browsing Data... button. Click on this button, ensure that the "Cached images and files" checkbox is checked, and click the Clear browsing data button.
• Firefox: From Menu, choose Preferences. In the pop-up Preferences box, select the Advanced icon. Under "Cached Web Content" is a Clear Now button.
2. Log out of all Google services (see step 1 of earlier instructions).
3. Quit all browser instances (completely close the browser).
4. Start up the browser, log into your Google account, and visit EWOQ via this link: https://www.google.com/evaluation/ads/beta/rating/gwt/index.html

This should allow you to access EWOQ. If these steps don’t work, please make sure that you are using an EWOQ-supported browser. EWOQ supports access by Chrome and Mozilla Firefox, but not other browsers such as Safari or Internet Explorer. EWOQ doesn’t support access by mobile devices, including tablets. Using an unsupported device or browser can cause problems accessing EWOQ. If you were using an unsupported device or browser when you tried logging out and/or clearing your cache, please switch to a supported device and browser and try again. If you are using a compatible device and browser, have followed all of the steps outlined above, and still cannot access EWOQ, please contact us at [email protected].

Contact Ads Eval The method you use to contact the ads evaluation administrative team will depend on what you need to communicate. The Rater Hub's Report a Problem page contains detailed information on how and when to contact the Ads Eval Administrative Team. Contact options are briefly reviewed below:

Reporting problems within EWOQ Ad Rating System On every task you rate, there is a Report a Problem link at the bottom of the Ad Rating Task page. Use this link to report a problem. Always review the task instructions before you report a problem in case you are expected to submit a task instead.

Fill out and Submit a Form: Out of Tasks New Raters If you just started your contract and began rating your training set, you will run out of tasks after you have submitted about 50 tasks. We automatically activate your account for regular tasks 1 to 4 hours after you run out of training tasks. We recommend that you take a break and try to acquire tasks after waiting a few hours. There is no need to contact us. Raters who have Completed Training We are constantly loading more work in the system for team members to evaluate, and we strive to have work available for acquisition at all times. We are not able to guarantee tasks will always be available for raters, and task shortages do happen sometimes. If you encounter a task shortage, we probably already know about it and are taking action to create more work. You can report task shortages by filling out the following form. Before reporting, we encourage you to take a break for a few hours and try again to acquire tasks. If you are still unable to acquire tasks, fill out the form. Do not report task shortages via email.

Contacting Ads Eval Administrators via Email

Because of the volume of emails received, administrators cannot guarantee that every email sent to us will receive a response. Before sending us an email, we expect you to first see if your question can be answered:
• in the task instructions for the task you are working on
• on the Rater Hub Reporting a Problem page, or elsewhere on the Rater Hub

Note: We have designated tools for reporting task shortages and problems with specific tasks. We will not respond to emails on these topics. Please contact us using the provided tools instead.
• for task shortages, fill out the form for that purpose
• for problems within a task, use the Report a Problem link


Email the Ads Eval Administrative Team if you have a general question or need to report a problem not associated with a particular task. Send an email to [email protected] from the Gmail account you use to evaluate tasks. When sending an email, please follow good email etiquette:
• use a short, descriptive subject line related to the issue you are reporting
• include all relevant information in the body of your message
• if you have a question about a specific task type, please provide the title (found at the top of the ad rating page when you're looking at a task), or describe it if you don’t know the title
• if appropriate, include your browser and operating system information
• if appropriate, include a screenshot of what you are trying to describe

In some cases, administrators may ask you for additional information about what you’re experiencing with your browser. The Advanced Topics section provides step-by-step instructions for providing information you may be asked for.

Contacting Your Contract Administrator

As a contract worker, your contract administrator manages and is responsible for all work, timesheet, and billing issues. Please contact them if you have questions about your timesheet, billing, or any other contract-related question. Your contract administrator has provided you with email/phone contact information in the documents they sent you.
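
The contact options reviewed in this section can be summarized as a small decision function. The following is only an illustrative sketch, not part of EWOQ: the issue labels ("task_problem", "task_shortage", "contract", "general"), the function name, and its parameters are hypothetical names chosen for this example.

```python
from enum import Enum


class Channel(Enum):
    """Contact options described in this section."""
    REPORT_A_PROBLEM_LINK = "Report a Problem link on the task page"
    TASK_SHORTAGE_FORM = "task shortage form"
    EMAIL_ADS_EVAL = "email to the Ads Eval Administrative Team"
    CONTRACT_ADMINISTRATOR = "your contract administrator"
    NO_CONTACT_NEEDED = "no contact needed; take a break and try again"


def choose_channel(issue: str,
                   in_training: bool = False,
                   retried_after_break: bool = False) -> Channel:
    """Pick a contact channel for a hypothetical issue label.

    issue: one of "task_problem", "task_shortage", "contract", "general"
           (labels invented for this sketch).
    in_training: True if the rater is still on the training set.
    retried_after_break: True if the rater already waited a few hours
           and still could not acquire tasks.
    """
    if issue == "task_problem":
        # Problems with a specific task are never reported by email.
        return Channel.REPORT_A_PROBLEM_LINK
    if issue == "task_shortage":
        # New raters are activated automatically after training tasks run out.
        if in_training or not retried_after_break:
            return Channel.NO_CONTACT_NEEDED
        return Channel.TASK_SHORTAGE_FORM
    if issue == "contract":
        # Work, timesheet, and billing questions go to the contract administrator.
        return Channel.CONTRACT_ADMINISTRATOR
    if issue == "general":
        return Channel.EMAIL_ADS_EVAL
    raise ValueError(f"unknown issue label: {issue!r}")
```

For example, choose_channel("task_shortage", retried_after_break=True) selects the task shortage form, while the same call for a rater still in training selects no contact at all.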

