Why? Because the new LinuxTaskController changes the user at the last minute, via the call to getRunAsUser().
@Override
public void deleteAsUser(String user, Path dir, Path... baseDirs) {
  verifyUsernamePattern(user);
  String runAsUser = getRunAsUser(user);
  List<String> command = new ArrayList<String>(
      Arrays.asList(containerExecutorExe,
          runAsUser,
          user,
          ...
String getRunAsUser(String user) {
  return UserGroupInformation.isSecurityEnabled() ? user : nonsecureLocalUser;
}
This is a new feature in Hadoop 2.3+.
Before, in Hadoop 2.2 and below, the user itself was taken at face value. In the new scenario, however, the "getRunAsUser" method causes the job to run as a DIFFERENT user if security is off.
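To make the behavior change concrete, here is a minimal standalone sketch of the selection logic. It does not use any Hadoop classes: the securityEnabled flag stands in for UserGroupInformation.isSecurityEnabled(), and the class name RunAsUserSketch and the value "nobody" for nonsecureLocalUser are assumptions made purely for illustration.

```java
public class RunAsUserSketch {
    // Stand-in for UserGroupInformation.isSecurityEnabled() (assumption: a plain flag).
    private final boolean securityEnabled;
    // Stand-in for the configured non-secure local user (assumed to be "nobody" below).
    private final String nonsecureLocalUser;

    public RunAsUserSketch(boolean securityEnabled, String nonsecureLocalUser) {
        this.securityEnabled = securityEnabled;
        this.nonsecureLocalUser = nonsecureLocalUser;
    }

    // Mirrors the ternary in getRunAsUser() quoted above.
    public String getRunAsUser(String user) {
        return securityEnabled ? user : nonsecureLocalUser;
    }

    public static void main(String[] args) {
        RunAsUserSketch secure = new RunAsUserSketch(true, "nobody");
        RunAsUserSketch nonSecure = new RunAsUserSketch(false, "nobody");
        // With security on, the submitting user is kept.
        System.out.println("secure:     " + secure.getRunAsUser("alice"));
        // With security off, the submitting user is swapped for the local user.
        System.out.println("non-secure: " + nonSecure.getRunAsUser("alice"));
    }
}
```

The point of the sketch is only that the submitting user's name no longer decides who the process runs as once security is off.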
Here are some more details on how this ultimately manifests in the C code, which starts the job as 'nobody'.
So, if security is "off", then the above command is called with the user 'nobody'.
The problem is that there is more than one way to secure Hadoop :). So really, it's not a question of whether "security" is off, but of whether security is implemented via the standard Hadoop route.
In main.c:
  #include "task-controller.h"
  ...
  int ret = set_user(argv[optind]);
In task-controller.c:
  set_user:
    user_detail = check_user(user);
And when we "check_user", we ultimately fail with an error if user='nobody', because its uid is below min_uid and it is not whitelisted.
if (user_info->pw_uid < min_uid && !is_whitelisted(user)) {
  fprintf(LOGFILE, "Requested user %s is not whitelisted and has id %d,"
      "which is below the minimum allowed %d\n", user, user_info->pw_uid, min_uid);
  fflush(LOGFILE);
  free(user_info);
  return NULL;
}
Okay, so what about hadoop 2.2?
In Hadoop 2.2, there is no notion of a "getRunAsUser". However, this was a security hole: per YARN-1253, we can write a job that can run as any user.
When did this all change? Specifically in this JIRA: https://issues.apache.org/jira/browse/YARN-1253
So in summary: in Hadoop 2.3 and beyond, LinuxTaskControllers will CHANGE your user into a "non-secure" default user if security isn't explicitly enabled!
linux-container-executor.nonsecure-mode.limit-users would have to be set to false for YARN-2424 to allow yarn container tasks to run as the user who submitted them (un-authenticated if Kerberos is not enabled)
The full configuration parameter to set is "yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users", in yarn-site.xml.
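As a rough illustration of what that setting changes, here is a hypothetical model of the user selection once limit-users is taken into account. It is a sketch and not the Hadoop implementation: the class LimitUsersSketch, the method runAsUser and the limitUsers flag are made-up names, with limitUsers standing in for the value of yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users.

```java
public class LimitUsersSketch {
    // Hypothetical model: limitUsers stands in for the value of
    // yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users.
    static String runAsUser(boolean securityEnabled, boolean limitUsers,
                            String user, String nonsecureLocalUser) {
        // With security on, or with limit-users set to false, keep the submitting user.
        if (securityEnabled || !limitUsers) {
            return user;
        }
        // Otherwise fall back to the single non-secure local user.
        return nonsecureLocalUser;
    }

    public static void main(String[] args) {
        // Security off, limit-users true: the task runs as the local user.
        System.out.println(runAsUser(false, true, "alice", "nobody"));
        // Security off, limit-users false: the task runs as the submitting user.
        System.out.println(runAsUser(false, false, "alice", "nobody"));
    }
}
```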
Thanks for the note on YARN-2424, and also for the yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users notification. I'll look into both.