In other words:
When using JsonStorage, your schema should look like this:
{f1: chararray,count: long}
NOT like this:
{f1: NULL,count: long}
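A quick way to catch this before calling store is to look for NULL-typed fields in the schema string. The sketch below is only an illustration: NullSchemaCheck and nullTypedFields are hypothetical names, not part of Pig, and it assumes a flat schema string in the format shown above (nested bags or tuples are not handled).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig: scans a flat schema string such as
// "{f1: NULL,count: long}" and reports the fields whose type is NULL.
class NullSchemaCheck {

    static List<String> nullTypedFields(String schema) {
        List<String> result = new ArrayList<>();
        String body = schema.trim();
        // Strip the surrounding braces if they are present.
        if (body.startsWith("{") && body.endsWith("}")) {
            body = body.substring(1, body.length() - 1);
        }
        // Assumption: fields are separated by commas and never nested.
        for (String field : body.split(",")) {
            String[] parts = field.split(":");
            if (parts.length < 2) {
                continue; // no "name: type" pair in this chunk
            }
            // The last token is the type, the one before it is the field name.
            // This also copes with prefixed names like "null::product:NULL".
            String name = parts[parts.length - 2].trim();
            String type = parts[parts.length - 1].trim();
            if (type.equalsIgnoreCase("NULL")) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(nullTypedFields("{f1: chararray,count: long}")); // prints []
        System.out.println(nullTypedFields("{f1: NULL,count: long}"));      // prints [f1]
    }
}
```

Any field this reports is one that needs an explicit type (for example :chararray) in the script before JsonStorage will accept it.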
The details:
An example is below (the comment in the code marks the change I had to make in this Pig script so that JsonStorage exports the fields without exploding on the NULL schema type element).
pigServer.registerQuery( // error happens here!
    "id_details = FOREACH csvdata GENERATE " +
    "FLATTEN" +
    "(STRSPLIT" +
    "(ID,',',3)) AS (drop, code, transaction) ," +
    "FLATTEN" +
    "(STRSPLIT" +
    /**
     * The schema has to be defined here
     * for any fields which are going to be exported as JSON!
     */
    "(DETAILS,',',5)) AS (lname, fname, date, price, product:chararray);");
pigServer.registerQuery(
    "transactions = FOREACH id_details GENERATE $0 .. ;");
pigServer.registerQuery(
    "transactionsG = group transactions by code;");
pigServer.registerQuery(
    "uniqcnt = foreach transactionsG {" +
    "sym = transactions.product ;" +
    "dsym = distinct sym ;" +
    "generate flatten(dsym.product) as f1:chararray, COUNT(dsym) as count ;" +
    "};");
pigServer.store("uniqcnt", "/tmp/bbb" + System.currentTimeMillis(), "JsonStorage");
Without the explicit chararray type on product, the script fails with this error:
ERROR 1031: Incompatable field schema: declared is "f1:chararray", infered is "null::product:NULL"
So the moral of the story: if you are using JsonStorage, start out with strong types to save yourself the hassle.
I just filed a JIRA for this, https://issues.apache.org/jira/browse/PIG-3627, and we will see how it evolves over time :).
FYI, thanks a lot to https://issues.apache.org/jira/secure/ViewProfile.jspa?name=cheolsoo for helping me with this!