Hey Guys,
Trying to figure out why my spark.read.format('json').load(path) is failing. I'm using a wildcard path, and anytime I have enough files that the load takes > 120 seconds, it fails with the following error and no other information:
TaskCanceledException: A task was canceled.
I see nothing but successful jobs in the job execution view. When I look at the monitor logs, there are no errors at all: it finds the paths just fine and loads them. The Spark history server shows no errors either, and Livy reports everything is fine.
But the task was canceled.
I've gone through my config.txt file and raised every setting I could find that pointed at 120 seconds. I'm using:
spark.rpc.message.maxSize 512
spark.rpc.lookupTimeout 100000
spark.scheduler.excludeOnFailure.unschedulableTaskSetTimeout 10000
spark.network.timeout 200000
spark.executor.heartbeatInterval 50000
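One thing worth double-checking: Spark config keys are case-sensitive (the property is spark.rpc.message.maxSize, with a capital S), and bare numbers in time properties can be interpreted differently per property, so explicit units are safer. A sketch of the same overrides as a spark-defaults.conf fragment; the 600s/60s values are my own guesses, not recommendations:

```
# spark-defaults.conf fragment (or pass each as a --conf flag).
# Explicit units remove any ambiguity about seconds vs. milliseconds.
spark.rpc.message.maxSize                                      512
spark.rpc.lookupTimeout                                        600s
spark.scheduler.excludeOnFailure.unschedulableTaskSetTimeout   600s
spark.network.timeout                                          600s
spark.executor.heartbeatInterval                               60s
```

Note that spark.executor.heartbeatInterval is documented as needing to be significantly less than spark.network.timeout, which the values above preserve. If the job still dies at exactly 120s after these take effect, the timeout may be coming from something outside Spark's own config (e.g. the Livy server or a gateway in front of it) rather than from these properties.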
Post Details
- Posted: 2 years ago
- Reddit URL: reddit.com/r/AZURE/comme...