Skip to content

Conversation

@ayee
Copy link

@ayee ayee commented Nov 2, 2012

Hi Pranab, this is Anthony from Algorithms IO.
In order for the job to run in the Amazon MapReduce, we had to allow the configuration to be loaded from S3. Added the code and thought it would be a good thing to contribute.

@pranab
Copy link
Owner

pranab commented Nov 3, 2012

Anthony

I saw the change. My suggestion will be to add S3 related code to the existing setConfiguration() method as below instead of a separate method. That way the job driver could call just one setConfiguration() method

This how I would like to integrate the change. Let me know if there is any issue. You could send another pull request. Or I could directly change the code

public static void setConfiguration(Configuration conf, String project) throws Exception{
    boolean found = false;
    String confFilePath = conf.get(CONF_FILE_PROP_NAME);

    //user provided config file path
    if (null != confFilePath){
        if (confFilePath.startsWith(S3_PREFIX)) { 
            Matcher matcher = s3pattern.matcher(confFilePath);
            matcher.matches();
            String bucket = matcher.group(1);
            String key = matcher.group(2);
            S3Object object = s3.getObject(new GetObjectRequest(bucket, key));
            is = object.getObjectContent();
            Properties configProps = new Properties();
            configProps.load(is);

            for (Object key : configProps.keySet()){
                String keySt = key.toString();
                conf.set(keySt, configProps.getProperty(keySt));
            }
        } else if (confFilePath.startsWith(HDFS_PREFIX)) {
            loadConfigHdfs( conf,  confFilePath.substring(HDFS_PREFIX_LEN));
            System.out.println("config found in user specified HDFS file");
        } else {
            loadConfig( conf,  confFilePath, false);
            System.out.println("config found in user specified FS  file");
        }
     } else {
        //default file system path
        confFilePath = FS_DEF_CONFIG_DIR + project + PROP_FILE_EXT;
        found = loadConfig( conf,  confFilePath, true);

        //default HDFS path
        if (!found) {
            confFilePath = HDFS_DEF_CONFIG_DIR + project + PROP_FILE_EXT;
            loadConfigHdfs( conf,  confFilePath);
            System.out.println("config found in default HDFS location");
        }  else {
            System.out.println("config found in default FS location");
        }
     }
}

@ayee
Copy link
Author

ayee commented Nov 3, 2012

Pranab, I didn't add the setConfiguration(Configuration) method. That was the one used and ran into an exception because of reco.properties in S3. I noticed setConfiguration(Configuration, String) was similar but since we didn't see errors I didn't change that.
I can still go and make the similar modifications to this method if that's ok by you.

On Nov 2, 2012, at 11:05 PM, Pranab Ghosh notifications@github.com wrote:

Anthony

I saw the change. My suggestion will be to add S3 related code to the existing setConfiguration() method as below instead of a separate method. That way the job driver could call just one setConfiguration() method

This how I would like to integrate the change. Let me know if there is any issue. You could send another pull request. Or I could directly change the code

public static void setConfiguration(Configuration conf, String project) throws Exception{
boolean found = false;
String confFilePath = conf.get(CONF_FILE_PROP_NAME);

//user provided config file path
if (null != confFilePath){
    if (confFilePath.startsWith(S3_PREFIX)) { 
        Matcher matcher = s3pattern.matcher(confFilePath);
        matcher.matches();
        String bucket = matcher.group(1);
        String key = matcher.group(2);
        S3Object object = s3.getObject(new GetObjectRequest(bucket, key));
        is = object.getObjectContent();
        Properties configProps = new Properties();
        configProps.load(is);

        for (Object key : configProps.keySet()){
            String keySt = key.toString();
            conf.set(keySt, configProps.getProperty(keySt));
        }
    } else if (confFilePath.startsWith(HDFS_PREFIX)) {
        loadConfigHdfs( conf,  confFilePath.substring(HDFS_PREFIX_LEN));
        System.out.println("config found in user specified HDFS file");
    } else {
        loadConfig( conf,  confFilePath, false);
        System.out.println("config found in user specified FS  file");
    }
 } else {
    //default file system path
    confFilePath = FS_DEF_CONFIG_DIR + project + PROP_FILE_EXT;
    found = loadConfig( conf,  confFilePath, true);

    //default HDFS path
    if (!found) {
        confFilePath = HDFS_DEF_CONFIG_DIR + project + PROP_FILE_EXT;
        loadConfigHdfs( conf,  confFilePath);
        System.out.println("config found in default HDFS location");
    }  else {
        System.out.println("config found in default FS location");
    }
 }

}

Reply to this email directly or view it on GitHub.

@pranab
Copy link
Owner

pranab commented Nov 4, 2012

Anthony

The method setConfiguration(Configuration) is deprecated. I will mark it so. Please add your code to the other setConfiguration() method as I suggested

@pranab
Copy link
Owner

pranab commented Nov 11, 2012

I have made the changes and based on pull request and my suggestion and committed the changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants