masanos note

🔧When you want to scrape an SPA site, PhantomJsCloud is one solution.

2023-01-02

Overview

Sites built as an SPA cannot be scraped as is, because their content is rendered by JavaScript in the browser.
With PhantomJsCloud, you can render JavaScript-driven pages such as SPAs to HTML and then scrape them.

Sample Code using Google Apps Script (GAS)

Copy the API key from PhantomJsCloud.

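// Target page to render, and the API key copied from PhantomJsCloud.
const URL = 'http://*.com/*';
const KEY = '*';

// Ask for the rendered HTML, with the response returned as JSON.
let payload =
{
  url: URL,
  renderType: 'HTML',
  outputAsJson: true
};
payload = JSON.stringify(payload);
payload = encodeURIComponent(payload);

// Call the PhantomJsCloud API and read the response body as text.
let fetchUrl = 'https://phantomjscloud.com/api/browser/v2/' + KEY + '/?request=' + payload;
let res = UrlFetchApp.fetch(fetchUrl).getContentText("UTF-8");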
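
The request-building steps in the sample can also be wrapped in a small helper, so the URL is built the same way for every target page. This is only a sketch: `buildFetchUrl` is a hypothetical name (not part of PhantomJsCloud or GAS), and the endpoint and payload fields are simply the ones used in the sample code.

```javascript
// Hypothetical helper: builds the PhantomJsCloud request URL
// from an API key and a target page URL, as in the sample code.
function buildFetchUrl(apiKey, targetUrl) {
  const payload = {
    url: targetUrl,
    renderType: 'HTML',
    outputAsJson: true
  };
  // The payload is sent as a URL-encoded JSON string in the `request` parameter.
  const encoded = encodeURIComponent(JSON.stringify(payload));
  return 'https://phantomjscloud.com/api/browser/v2/' + apiKey + '/?request=' + encoded;
}
```

In GAS, the result can be passed straight to `UrlFetchApp.fetch(buildFetchUrl(KEY, URL))`.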

PhantomJsCloud

How to get an API key
Sign up with your email address.

Check the sign-up email.

Your account is created.


You can check your free credit balance on the dashboard: https://dashboard.phantomjscloud.com/dash.html

The ApiKey value is what you set as KEY in the sample code.

OUT OF CREDITS

If you exceed the free credits, an error like the following is returned:

Exception: Request failed for https://phantomjscloud.com returned code 402. Truncated server response: {"name":"HttpStatusCodeException","message":"OUT OF CREDITS: Your account is out of both Daily Subscription Credits and Prepaid Credits. Either wai... (use muteHttpExceptions option to examine full response)
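
The error text itself suggests the `muteHttpExceptions` option. A minimal sketch of using it is shown here, so that a 402 does not throw and the status code can be checked instead. `fetchWithStatus` is a hypothetical helper, and the `fetcher` parameter is an assumption added so that the function can be tried outside GAS; in GAS you would pass `UrlFetchApp`.

```javascript
// Hypothetical helper: fetches without throwing on 4xx/5xx responses,
// so the caller can inspect the status code and the full body.
// `fetcher` is expected to behave like GAS's UrlFetchApp (assumption).
function fetchWithStatus(fetchUrl, fetcher) {
  const response = fetcher.fetch(fetchUrl, { muteHttpExceptions: true });
  const code = response.getResponseCode();
  const body = response.getContentText('UTF-8');
  return { ok: code === 200, code: code, body: body };
}
```

In GAS, `fetchWithStatus(fetchUrl, UrlFetchApp)` returns an object whose `code` is 402 when the account is out of credits, and `body` then holds the full, untruncated server response.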

Status codes

https://phantomjscloud.com/docs/ > Debugging Page Errors: Status Codes
200: OK The target page was captured properly.
400: Bad Request Your request had an error in it. Fix it before resubmitting.
401: Unauthorized You are using an invalid Api Key. Please check for typos, or create an account.
402: Payment Required Your account is out of credits. Login and either upgrade your Subscription or add Prepaid Credits.
403: Forbidden Your request was flagged due to abuse. Read the response for steps you should take to resolve the situation.
424: Failed Dependency The target page was not reachable (the request timed out). Check and make sure your target URL is valid.
429: Too Many Simultaneous Requests You sent a sudden spike of simultaneous requests. PhantomJsCloud can handle hundreds of simultaneous requests, but we require you to gracefully increase the number of concurrent requests over time, not send a sudden spike. Please increase the number of your simultaneous requests according to the schedule shown in the 'Testing and Performance Optimization' section of the docs page. (add +1 simultaneous requests every 3 seconds, or +10 simultaneous every 30 seconds). You may retry this request immediately, with no modifications.
500: Internal Server Error The PhantomJsCloud instance suffered an internal error. You can retry your request immediately, without modifications. If errors still occur, these are the known causes:
More time needed, retry with larger pageRequest.requestSettings.maxWait value.
An incompatible webfont is causing PhantomJs to crash, try blacklisting any font resources (.otf, .ttf, .woff) for example:
pageRequest.requestSettings.resourceModifier:[{regex:'.*ttf.*|.*otf.*|.*woff.*',isBlacklisted:true}]
If you still have problems, please submit your request to Support@PhantomJsCloud for diagnosis.
502: Bad Gateway Your request did not reach PhantomJsCloud due to a network failure. You can retry your request immediately, without modifications. If errors still occur, see the "502 Bad Gateway" Troubleshooting item above.
503: Server Too Busy SERVER TOO BUSY: The server is temporarily overwhelmed with other requests, and its request backlog is very large. We are returning this to you to prevent risk of a http timeout occurring instead. You may immediately retry your request. Support@PhantomJsCloud.com has been notified and will investigate. You may retry this request with no modifications.
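
The list above can be reduced to a simple rule for a script: some codes say the request may be retried without modifications, and the others need something fixed first. The following is a rough sketch of that grouping. `classifyStatus` and its return values are hypothetical names chosen for illustration, and the grouping is only my reading of the descriptions above.

```javascript
// Hypothetical helper: groups the status codes listed above by what to do next.
function classifyStatus(code) {
  if (code === 200) return 'ok';
  // 429, 500, 502, 503: the descriptions say the request can be retried without modifications.
  if ([429, 500, 502, 503].includes(code)) return 'retry';
  // 400, 401, 402, 403, 424: fix the request, API key, credits, or target URL first.
  if ([400, 401, 402, 403, 424].includes(code)) return 'fix';
  // Anything not in the list above.
  return 'unknown';
}
```

For 429, note that the description also asks you to increase the number of simultaneous requests gradually, so retrying alone may not be enough if the script sends many requests at once.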


Copyright© masanos All Rights Reserved.