Using an External Process in Your Nom Nom App

This article explains how to use an external process from your Nom Nom App. Specifically, it discusses how to use the Scrapy Python package. It assumes that the reader is already familiar with the information in the Creating Your First Nom Nom App article.

Full Example App

First, install Scrapy. Then create a new Nom Nom App:
nnd engine-tools create-new scrapy-example
Then replace the contents of pkg/executable.py with this code:
import logging
import scrapy

from nomnomdata.engine import Engine
from scrapy.crawler import CrawlerProcess

logger = logging.getLogger("engine.scrapy-example")

class TestSpider(scrapy.Spider):
    name = "test"
    allowed_domains = ["webscraper.io"]
    start_urls = ["https://webscraper.io/test-sites/tables"]

    def parse(self, response):
        logger.info(response.body)

engine = Engine(
    uuid="CHANGE-ME-PLEASE",
    alias="Scrapy Example",
    categories=["general"],
)

@engine.action(
    display_name="Run Scrapy Spider",
    description="",
)
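def run_spider(parameters):
    # CrawlerProcess manages the Twisted reactor that Scrapy runs on.
    process = CrawlerProcess()
    # Schedule the spider; crawl() takes the spider class, not an instance.
    process.crawl(TestSpider)
    # Block until the crawl has finished.
    process.start()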
The example above does not take in any input parameters.

CrawlerProcess

This Scrapy utility class starts the crawl that your Nom Nom App will interact with. The crawl still runs inside the Docker container where the rest of your code is running. The CrawlerProcess class has a method named crawl that takes a class derived from scrapy.Spider, which contains the details about the URLs that you want to examine. The class TestSpider plays this role in the sample code above. Calling start then runs the crawl and blocks until it has finished.

An even more detailed example is available on the Scrapy website.