Experience the Lightning Bolt


Three months ago, we announced how we are transforming the shopping experience at eBay, enabling our users to browse with style and speed. Our goal was to provide an engaging experience not only to users within the eBay site, but also to mobile users reaching eBay from external platforms like Google and Twitter. This is where AMP technology comes into play. We implemented an AMP version of our new product browse experience alongside the regular mobile web pages and launched them in June. At that time we did not make our AMP content discoverable to Google, as we had a few pending tasks to complete. Also, AMP links surfaced in Google search results only for publisher content, not for eCommerce.

Things have changed now. Google announced that it is opening AMP beyond the news industry to include eCommerce, travel, and more. On our end, we wrapped up the pending items and linked the AMP pages from the non-AMP pages to make them discoverable. Today we are happy to announce that users around the globe will start seeing eBay AMP links in Google search results and experiencing the lightning bolt — instant loading. We have close to 15 million AMP-based product browse pages, though not all of them will appear as AMP right away; the feature is being ramped up, and more pages will surface over time. Check out some popular queries in a mobile browser — “iPhone 6 no contract” and “canon digital cameras,” for example. The AMP lightning bolt icon next to a link indicates an AMP page. AMP for eCommerce is now a reality.

eBay AMP link in Google search results (left); eBay AMP product browse page (right)

Between now and then

Following our initial launch in June, we did a couple of things to make AMP ready for prime time. We outline a few of these efforts here.

Robust analytics system

Understanding how users interact with our pages is critical to providing the most optimized experience. The back-end system that powers the new product browse experience constantly collects users’ on-screen activity, learns from it, and optimizes the experience for subsequent visits. For example, if users interact more often with a module that appears below the fold, then on future visits to the same browse page that module will start appearing above the fold. Our non-AMP page has a custom analytics library that reports this activity to the back end.

AMP has a component (amp-analytics) for doing this. In our initial AMP launch, we used this component only to track page impressions. It provides a fairly exhaustive tracking mechanism, but we wanted more granular control at the element level, where each element dictates what it wants to track. We started working with the AMP team on this, came up with a spec, implemented it, and contributed it back to the open-source project. With the implementation in place, we achieved a robust analytics system that reports user interactions like click, scroll, and visibility to our back end, which in turn optimizes subsequent visits.
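
To give a flavor of what this looks like (a simplified sketch, not our production configuration; the endpoint URL and names are hypothetical), each element can declare its own tracking variables through data-vars-* attributes, and an amp-analytics trigger picks them up:

```html
<!-- Hypothetical module markup: the element itself declares what to report -->
<div class="module-item" data-vars-module-name="deals-carousel">...</div>

<amp-analytics>
  <script type="application/json">
  {
    "requests": {
      "interaction": "https://track.example.com/e?module=${moduleName}&type=${eventType}"
    },
    "triggers": {
      "moduleClick": {
        "on": "click",
        "selector": ".module-item",
        "request": "interaction",
        "vars": { "eventType": "click" }
      },
      "moduleVisible": {
        "on": "visible",
        "request": "interaction",
        "vars": { "eventType": "visible" },
        "visibilitySpec": { "selector": ".module-item", "visiblePercentageMin": 50 }
      }
    }
  }
  </script>
</amp-analytics>
```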

Feature parity

We mentioned in our previous blog post that most of the code is shared between the AMP and non-AMP pages. Even with this code sharing, there were still small feature inconsistencies between the two versions. We closed these gaps, fixed the inconsistencies, and put a process in place to make sure they do not creep back in. Having said that, there were certain UI components and behaviors that we could not achieve in the AMP version due to its restrictions. Some of these components are eCommerce-specific, and we are working with the AMP team to add them to the component list so everyone can benefit. A good example is a tabbed UI component, and there is already a feature request to get it implemented.

Streamlined build process

During the initial launch, we put manual effort into managing assets (CSS and JavaScript) between the AMP and non-AMP versions. In the AMP version there should be no JavaScript and all CSS should be inline, whereas in the non-AMP version both CSS and JavaScript should be bundled and externalized. Doing this manually was not ideal. Our asset pipeline tool, Lasso, had a solution for this: conditional dependencies. We created an AMP flag that is initialized to true when the request is for an AMP page and is then set as a Lasso flag. The pipeline picks it up and automatically bundles, externalizes, or inlines resources based on the conditions. This was a big time saver and ended up being very efficient.
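
As a rough sketch of how this works (based on Lasso’s documented conditional-dependency support, not our exact setup; the asset names and the AMP-detection helper are hypothetical):

```javascript
// browser.json (sketch): each dependency carries the flag condition under which it applies.
// {
//   "dependencies": [
//     { "path": "./browse-amp.css", "if-flag": "amp" },
//     { "path": "./browse.css", "if": "!flags.contains('amp')" },
//     { "path": "./browse.js", "if": "!flags.contains('amp')" }
//   ]
// }

const lasso = require('lasso');

// isAmpRequest is a hypothetical helper that inspects the incoming request.
function renderPageAssets(req) {
  return lasso.lassoPage({
    name: 'product-browse',
    dependencies: [require.resolve('./browser.json')],
    flags: isAmpRequest(req) ? ['amp'] : [] // drives the conditions above
  });
}
```

Lasso then inlines or externalizes each conditional bundle according to the page configuration, so neither version needs hand-managed assets.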

The road ahead

We are not done yet; in fact, we are just getting started. We have a bunch of tasks lined up.

  • Beyond AMP — We know AMP pages are fast, but what about the subsequent pages the user visits? Currently, when users click a link in an AMP page, a new tab opens and the destination page loads there; in our case, that is the mobile web version of the destination page. We want that experience to be as fast and consistent as the AMP experience. There is an AMP component (amp-install-serviceworker) to achieve this goal, and our top priority is to leverage it to create a seamless transition from the AMP page to the target page (a minimal markup sketch follows this list). We are also discussing with the Google team how to avoid the new tab and continue the experience in the same window.
  • Cache freshness — AMP content is served from the Google AMP Cache, which has its own update policy. What this means for eBay is that for popular product queries, users always see fresh content, while for certain extremely rare queries a few users may see stale content. Although this is not a common scenario, there is an AMP component (amp-fresh) in the works to fix it, and we will integrate it as soon as it is ready. In the meantime, we have a script that we run manually for a few products to update the AMP content in the cache.
  • Unified version — Currently we have two versions of the new browse pages: AMP and non-AMP. The AMP version shows up for users searching on Google, and the non-AMP version for users searching within eBay. Although both are highly optimized, look the same, and share most of their code, updating two versions is still a maintenance overhead, and we always need to watch out for feature parity. In the future, based on how the AMP pages perform, we may choose to have one mobile version (AMP) and serve it to all platforms.
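
For the first item above, the component usage itself is small (a minimal sketch; the script URLs are placeholders):

```html
<amp-install-serviceworker
  src="https://www.ebay.com/service-worker.js"
  data-iframe-src="https://www.ebay.com/install-service-worker.html"
  layout="nodisplay">
</amp-install-serviceworker>
```

A service worker installed this way can pre-cache the assets of the destination pages, so the hop from the AMP page to the mobile web page stays fast.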

We are very excited to provide the AMP experience to our mobile users coming from Google. We have been playing with it for a while, and it is indeed lightning fast. Mobile browsing can be slow and sometimes frustrating; this is where AMP comes in and delivers a consistent and fast experience. We hope our users benefit from this new technology.

Senthil Padmanabhan | Principal Engineer at eBay

eBay Releases Dynamic Application Security Testing Proxy as Open Source


In an effort to contribute to the open-source security community, Global Information Security (GIS) at eBay has released its DAST Proxy as open-source software. DAST Proxy is a lifecycle management tool for dynamic application security scans with a unique feature set. It is available for download and contribution under the MIT License at https://github.com/eBay/DASTProxy.

What is DAST Proxy?

DAST Proxy provides workflows that help users record browser actions and submit them to a back-end scan engine, such as AppScan. It keeps the user updated on scan status and publishes the scan results. It supports automation through a set of RESTful web services that can be seamlessly integrated into existing functional test cases, whether written in Selenium or any other automation framework, for security testing. DAST Proxy also works with browser-based test cases for both web and mobile applications.

DAST Architecture

[Figure: DAST Proxy architecture]

How does DAST Proxy work?

This section explains how to conduct a manual dynamic security scan using DAST Proxy.

To start, the user needs two browsers. In Browser 1, the user obtains a proxy host and port generated by the DAST server on DAST Proxy’s home page. The user then enters this host and port into Browser 2’s proxy settings. Once the proxy is set up, DAST Proxy records all the web traffic between Browser 2 and the QA server and stores it in a HAR file. This file is then submitted to the back-end scan engine for thorough dynamic security testing. DAST Proxy polls the back-end engine for the status and resulting vulnerabilities and stores them in the database, which the user can access via the DAST Proxy dashboard.

[Figure: DAST Proxy manual scan flow]
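
The same flow also applies to automated runs, where the test framework itself points the browser at the proxy. Here is a minimal sketch using the Node.js selenium-webdriver bindings (the proxy host and port stand in for the values issued on the DAST Proxy home page, and the URL is a placeholder for the QA server under test):

```javascript
const { Builder } = require('selenium-webdriver');
const proxy = require('selenium-webdriver/proxy');

async function runRecordedScan() {
  // Route all browser traffic through the endpoint issued by DAST Proxy,
  // so every request/response pair is recorded into the HAR file.
  const driver = await new Builder()
    .forBrowser('chrome')
    .setProxy(proxy.manual({ http: 'dast-proxy-host:8080', https: 'dast-proxy-host:8080' }))
    .build();

  try {
    // Drive the functional test case as usual; DAST Proxy captures the traffic.
    await driver.get('https://qa.example.com/signin');
  } finally {
    await driver.quit();
  }
}
```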

DAST Proxy features

  • Recording the scan and submitting it to a back-end scan engine, such as AppScan
  • Dashboard with list of scans, vulnerabilities, and payloads
  • Integration with JIRA system
  • Ability to rerun the scans from the dashboard
  • Support for manual API endpoint testing with browser plug-ins, such as Postman

Features in the pipeline

  • ZAP (OWASP Zed Attack Proxy project) integration
  • Selenium integration
  • NT OBJECTives integration

A Glimpse into Experimentation Reporting at eBay


Around 1,500 A/B tests are performed at eBay each year, across different sites and devices. Experimentation is a key business process at eBay and plays an important role in the continual improvement of business performance through optimization of the user experience. Data insights from these tests enable teams to answer important questions such as “How will this new product feature benefit eBay?” or “Does this new page layout improve user engagement and increase GMB?”

Testing allows business units to explore new ideas involving page content and style, search algorithms, product features, and more, ranging from subtle tweaks to radical variations. Test variations can easily be targeted to segments of the total customer population based on the desired ramp-up percentage and contextual criteria (geographic, system-, or app-specific), providing a level of assurance before launching to a broader audience.

Experiment Lifecycle

Lifecycle of an experiment at eBay

Every experiment begins with an idea. The first step is to prepare a test proposal document, which summarizes what is being tested, why it is being tested, the amount of traffic assigned, and what action will be taken once the results are published. This document is reviewed and approved in weekly TPS council meetings.

Next, the test operations team works with the product development team to find the right slot in the test schedule and to understand the impact of interactions with other tests. The team then sets up the experiment, assigns the necessary traffic to treatment and control, and launches it once smoke testing (assigning a minimal amount of traffic to make sure everything works as expected) succeeds and the necessary validation steps are completed.

Once the experiment launches, tracking begins immediately and data is collected. Reports providing the necessary insights are generated daily, and cumulatively, throughout the data collection period. The final results are published to a wider audience after the experiment is complete, which concludes the life cycle of an experiment.

Experimentation reporting

This post will provide a quick overview of the reporting process. Before going further, let’s define some basic terms related to experimentation.

Definitions

  • GUID: A visitor is uniquely identified by a GUID (Globally Unique ID). This is the fundamental unit of our traffic, representing a browser on a machine (PC or handheld) visiting the site. It is derived from the cookies that an eBay site drops in the user’s browser.
  • UserId: A unique ID assigned to each registered user on the site.
  • Event: Every activity of the user captured on the site.
  • Session: All the activity of a user within a day, until 30 minutes of inactivity elapses. A session is the aggregate of many events.
  • GUID MOD: The entire eBay population is divided into 100 buckets. A Java hash converts the GUID into a 10-digit hash, and the modulo of that hash determines the bucket to which the GUID is assigned (see the sketch after this list). A specific GUID will never fall into two different GUID MODs.
  • Treatment and control: The feature to be tested is referred to as the “treatment,” and the “control” is the default behavior.
  • Versions: Any change in the experiment during the active state will create a new version of the experiment. Major and Minor versions are created based on the change’s impact on the experiment.
  • Classifier: A classifier is one of the primary dimensions on which we slice the data; different dimensions and metrics are reported under it:

    • Total GUID Inclusive (TGI) — All the GUIDs that qualified for a particular treatment or control
    • Treated — All the GUIDs that have seen the experience of the treatment
    • Untreated — All the GUIDs that qualified but have not seen the experience of the treatment
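
To make the GUID MOD bucketing above concrete, here is an illustrative sketch (the actual internal hash function is not shown here; a Java-style string hash stands in for it):

```javascript
// Bucket a GUID into one of the 100 GUID MODs.
// The hash is deterministic, so a given GUID always lands in the same bucket.
function guidMod(guid) {
  let hash = 0;
  for (let i = 0; i < guid.length; i++) {
    hash = (Math.imul(hash, 31) + guid.charCodeAt(i)) | 0; // 32-bit, Java-style
  }
  return Math.abs(hash) % 100; // bucket in the range 0-99
}
```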

Overview

The following figure shows a simplified view of the reporting process.

[Figure: Simplified view of the reporting process]

Upstream data sets

Let us outline the upstream data sets that the process depends on. The data is stored in Hadoop and Teradata systems.

  • User data: Event-level raw data of the user activity on the site, updated every hour.
  • Transaction and activity data: Data sets that capture the metric-level activity of the user such as bid, offer, watch, and many more.
  • Experiment metadata: Metadata tables that provide information about the experiments, treatments, GUID MOD, and various other parameters.

Stage 1

Every day, the process first checks that the upstream data sets have loaded; stage 1 is triggered once all of them are available. In this stage, detail data sets at the GUID and session levels are generated from the event-level data.

Treatment session: This is one of the primary data sets, holding GUID- and session-level data at the treatment and version levels. It also carries indicators for various dimensions that we will not cover in this post.

Transaction detail data set: All GUID- and session-level activity related to transaction metrics, such as revenue, is captured here. This data set does not have any treatment-level data.

Activity detail data set: The same as the transaction detail data set, but capturing activity-level metrics such as bid, offer, BIN (Buy It Now), and watch.

There are around six more data sets that we generate on a daily basis; we will not go into detail about them in this post. All the processing happens on Hadoop, and the data is copied to Teradata for analysts to access.

Stage 2 and Outlier Capping

The data sets generated in stage 1 act as upstream data sets for stage 2, in which many data transformations and manipulations happen. Data is aggregated at the GUID, treatment, and dimension levels and stored in Hadoop. This data is not moved to Teradata, because this stage is an intermediate step in our process. Outlier capping is applied to the metrics in the data sets populated from stage 2 to handle extreme values.
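
As a simple illustration of what outlier capping does (a sketch; the percentile threshold is an assumption, not the production value), extreme metric values are clipped at a chosen percentile so that a handful of outliers cannot dominate the averages:

```javascript
// Winsorize: cap every value at the given percentile of the distribution.
// The threshold passed in here is illustrative, not the production setting.
function capOutliers(values, percentile) {
  const sorted = [...values].sort((a, b) => a - b);
  const idx = Math.min(sorted.length - 1, Math.ceil((percentile / 100) * sorted.length) - 1);
  const cap = sorted[idx];
  return values.map(v => Math.min(v, cap));
}

// Example: one extreme purchase no longer dominates the treatment mean.
capOutliers([5, 8, 7, 6, 12000], 80); // → [5, 8, 7, 6, 8]
```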

Stage 3

The output from stage 2 is fed into stage 3, the summary process. The data is aggregated at the treatment, version, and dimension levels, and all the summary statistics are calculated in this step. The data is stored in Hadoop and copied over to Teradata, where MicroStrategy accesses it to publish different reports.

Stratification

Post-stratification is an adjustment method used in data analysis to reduce the variance of estimates. In stratification, subjects are randomized to treatment and control at the beginning of the experiment. After data collection, they are stratified according to pre-experiment features, so that subjects are more similar within a stratum than across strata.

The overall treatment effect is then estimated as the weighted average of the treatment effects within individual strata. Because the variance of the overall estimate consists of variance due to noise and variance due to differences across strata, stratifying the experiment subjects removes the variance due to strata differences, and thus the variance of the estimated overall treatment effect is reduced.
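
In symbols (the standard post-stratification estimator, stated here for illustration): with strata k = 1, ..., K, stratum weights w_k = n_k / n, and a per-stratum treatment-effect estimate, the overall estimate and its variance (assuming independence across strata) are:

```latex
\hat{\tau} = \sum_{k=1}^{K} w_k \,\hat{\tau}_k,
\qquad
\operatorname{Var}\big(\hat{\tau}\big) = \sum_{k=1}^{K} w_k^{2} \,\operatorname{Var}\big(\hat{\tau}_k\big)
```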

This process runs in parallel and generates stratified transactional metrics. The processing happens in Hadoop, and the data is copied over to Teradata, where different reports access it.

Scala, Hive, SQL, SAS, R, and MicroStrategy are some of the technologies and statistical packages we use throughout the process. Most of the processing happens in Hadoop, and minor manipulations occur in Teradata.

This concludes the main topic of this post. One critical aspect of this process is data quality, as inaccurate results can affect the decisions being made. In the next post, we will discuss different data quality challenges and how we are tackling them.

Happy Testing!