🧠 Second Brain

HEY.com: Get a list of ScreenedIn/Out emails

Last updated Feb 9, 2024

If you are here to learn how the whole workflow works: there will be an upcoming note, HEY-Screener in (Neo)Mutt. For now, you can check all the scripts in my Mutt dotfiles.

# Grabbing ScreenedIn/Out emails from the Screener page

The HEY Screener URL we want to scrape is https://app.hey.com/my/clearances?page=3 (here, page 3). The CSS classes to grab are screened-person--denied and screened-person--approved.
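To make the target concrete, here is a minimal, dependency-free sketch of pulling addresses out of such a page. The HTML fragment is hypothetical: only the class names come from this note, and the real HEY markup may well differ. `ScreenerParser` and `SAMPLE_HTML` are names made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical fragment; only the class names are taken from this note.
SAMPLE_HTML = """
<div class="screened-person screened-person--approved">
  <div class="screened-person__details">Alice <span>alice@example.com</span></div>
</div>
<div class="screened-person screened-person--denied">
  <div class="screened-person__details">Spam Co <span>spam@example.org</span></div>
</div>
"""


class ScreenerParser(HTMLParser):
    """Collect the first <span> text inside each entry's details element."""

    def __init__(self):
        super().__init__()
        self.emails = {"approved": [], "denied": []}
        self._status = None        # status of the entry currently being read
        self._in_details = False   # inside .screened-person__details
        self._in_span = False      # inside the <span> holding the address

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "screened-person--approved" in classes:
            self._status, self._in_details = "approved", False
        elif "screened-person--denied" in classes:
            self._status, self._in_details = "denied", False
        elif "screened-person__details" in classes:
            self._in_details = True
        elif tag == "span" and self._in_details:
            self._in_span = True

    def handle_endtag(self, tag):
        if tag == "span" and self._in_span:
            # Stop after the first span of the details element.
            self._in_span = False
            self._in_details = False

    def handle_data(self, data):
        if self._in_span and self._status and data.strip():
            self.emails[self._status].append(data.strip())


parser = ScreenerParser()
parser.feed(SAMPLE_HTML)
print(parser.emails)
```

The state tracking is deliberately simple and would need hardening for nested or irregular markup.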

# Automated: With open API, including pagination (Python)

This is the second option, which I created after the Console approach didn’t scale: it only works for one page at a time.

I then looked for an open API (see further below for how to find it). Having found one, I used Python to request each page, extract the emails, and follow the “Older” button to the next page until there are no more pages:

import requests
from bs4 import BeautifulSoup
import os


def scrape_emails(url, cookies):
    page = 1
    denied_emails = []
    approved_emails = []

    with requests.Session() as session:
        while True:
            response = session.get(url, params={"page": page}, cookies=cookies)
            soup = BeautifulSoup(response.text, "html.parser")

            # Extract emails
            for element in soup.select(".screened-person--denied"):
                email = element.select_one(".screened-person__details span")
                if email:
                    denied_emails.append(email.get_text(strip=True))

            for element in soup.select(".screened-person--approved"):
                email = element.select_one(".screened-person__details span")
                if email:
                    approved_emails.append(email.get_text(strip=True))

            # Check for the 'Older' button/link
            next_page_link = soup.select_one(
                'a.paginator__next[href*="/my/clearances?page="]'
            )
            if not next_page_link:
                break  # No more pages

            page += 1
            # if page == 3:
            #     break

    return denied_emails, approved_emails


def write_to_file(filename, email_list):
    with open(filename, "w") as file:
        for email in email_list:
            file.write(f"{email}\n")


cookies = {
    # Set the HEY_COOKIE environment variable to your HEY cookie value.
    # To find it, load the Screener and look in the browser's Network tab for the
    # `https://app.hey.com/my/clearances?page=` request; its cookies are listed there.
    # The value might need to be updated after you log in again.
    "_csrf_token": os.getenv("HEY_COOKIE"),
}


url = "https://app.hey.com/my/clearances"
denied_emails, approved_emails = scrape_emails(url, cookies)

# Write the lists to files
write_to_file("denied_emails.txt", denied_emails)
write_to_file("approved_emails.txt", approved_emails)

print("Denied Emails:", denied_emails)
print("Approved Emails:", approved_emails)

See the latest version on GitHub.
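The pagination idea can also be looked at in isolation from the network code. The following is a minimal sketch, not the script's actual structure: `iter_pages`, `fetch`, `has_next`, and the `max_pages` safety cap are hypothetical names, and the fake pages stand in for real HTTP responses so the loop can be tried without a HEY account.

```python
from typing import Callable, Iterator


def iter_pages(
    fetch: Callable[[int], str],
    has_next: Callable[[str], bool],
    max_pages: int = 1000,
) -> Iterator[str]:
    """Yield page bodies from page 1 onward until no "Older" link is found."""
    for page in range(1, max_pages + 1):
        html = fetch(page)
        yield html
        if not has_next(html):
            break  # last page reached


# Fake three-page Screener used instead of real requests.
FAKE_PAGES = {
    1: '<a class="paginator__next" href="/my/clearances?page=2">Older</a>',
    2: '<a class="paginator__next" href="/my/clearances?page=3">Older</a>',
    3: "<p>last page</p>",
}

pages = list(
    iter_pages(FAKE_PAGES.__getitem__, lambda html: "paginator__next" in html)
)
print(len(pages))  # 3
```

Keeping the fetch and the "is there a next page" check as parameters makes the loop testable offline, and the cap guards against an endless loop if the stop condition never triggers.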

Make sure to set the HEY_COOKIE environment variable. To find the value, load the Screener and search the browser's Network tab for the https://app.hey.com/my/clearances?page= request.

There you can see the cookies used. The value might need to be updated after you log in again.
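If copying a single cookie value turns out not to be enough, one possible variation is to copy the whole Cookie request header from the Network tab instead. The sketch below assumes exactly that, which is not what the script above does: `cookies_from_header` and `load_hey_cookies` are hypothetical helpers, and the cookie names in the usage line are made up.

```python
import os


def cookies_from_header(header: str) -> dict:
    """Turn a raw "name=value; other=value" Cookie header into a dict."""
    cookies = {}
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:  # skip empty or malformed fragments
            cookies[name] = value
    return cookies


def load_hey_cookies(env_var: str = "HEY_COOKIE") -> dict:
    """Read the header from the environment and fail loudly if it is missing."""
    header = os.environ.get(env_var)
    if not header:
        raise RuntimeError(
            f"Set {env_var} to the Cookie header copied from the Network tab"
        )
    return cookies_from_header(header)


# Made-up cookie names, just to show the shape of the result.
print(cookies_from_header("session=abc123; theme=dark"))
```

The resulting dict could be passed as `cookies=` to `requests`, in the same way as the `cookies` dict in the script. Failing early on a missing variable avoids silently scraping a login page and writing empty files.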

# Manually per page: Console (JavaScript)

Open the Console in the browser's developer tools and extract all ScreenedIn/Out emails from the current page:

const extractEmails = (className) => {
    const emails = [];
    document.querySelectorAll(`.${className}`).forEach(element => {
        const emailElement = element.querySelector('.screened-person__details');
        if (emailElement) {
            const emailText = emailElement.textContent.trim();
            // Inside a character class, "|" is a literal pipe, so use [A-Za-z] for the TLD
            const emailMatch = emailText.match(/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/);
            if (emailMatch) {
                emails.push(emailMatch[0]);
            }
        }
    });
    return emails;
};

const deniedEmails = extractEmails('screened-person--denied');
const approvedEmails = extractEmails('screened-person--approved');

console.log('Denied Emails:', deniedEmails);
console.log('Approved Emails:', approvedEmails);
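One detail worth checking in email patterns of this kind: inside a character class, `|` is a literal pipe rather than alternation, so a class written as `[A-Z|a-z]` also accepts `|`. A minimal Python sketch of the difference (Python only for consistency with the script earlier in this note; the sample text is made up):

```python
import re

# Top-level domain restricted to letters.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
# Variant with a stray "|" inside the class: it also matches a pipe character.
LOOSE_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b")

text = "Jane Doe jane@example.com|x"

strict_match = EMAIL_RE.search(text).group()
loose_match = LOOSE_RE.search(text).group()

print(strict_match)  # jane@example.com
print(loose_match)   # jane@example.com|x
```

On typical Screener entries both patterns will likely give the same result; the difference only shows up when a pipe directly follows the address.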

Origin: HEY-Screener in (Neo)Mutt
References: Getting the Data – Scraping
Created 2023-11-21