SageMaker deployment not functional on default example

#7
by jcardenes - opened

I followed the SageMaker deployment steps, but calling predict returns a 400 error whose message is just 'gpt_neox'.

I got the same result with both the 3b and 7b models.

Default steps:

from sagemaker.huggingface import HuggingFaceModel
import boto3

iam_client = boto3.client('iam')
role = iam_client.get_role(RoleName='YourRoleName')['Role']['Arn']

hub = {
    'HF_MODEL_ID':'stabilityai/stablelm-base-alpha-7b',
    'HF_TASK':'conversational'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',
    env=hub,
    role=role, 
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1, # number of instances
    instance_type='ml.m5.xlarge' # ec2 instance type
)

predictor.predict({
    'inputs': "Can you please let us know more details about your "
})

Error:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "\u0027gpt_neox\u0027"
}
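
For readability: the "message" field in the response is JSON-escaped, and \u0027 is an apostrophe, so the message is really 'gpt_neox' in quotes. That looks like the repr of a KeyError on the model type, though this is only a reading of the log and not something the error states. A minimal sketch of decoding it, assuming the body is copied by hand from the error above (the `body` string and the `decode_message` helper are illustrative, not part of the SageMaker SDK):

```python
import json

# Response body copied by hand from the ModelError above (raw string so that
# the \u0027 escapes reach the JSON parser unchanged).
body = r'''{
  "code": 400,
  "type": "InternalServerException",
  "message": "\u0027gpt_neox\u0027"
}'''


def decode_message(raw: str) -> str:
    """Parse the endpoint's JSON error body and return its 'message' field."""
    return json.loads(raw)["message"]


message = decode_message(body)
print(message)
```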

