How do I obtain the filename associated with a page's outgoing link

I have images in pages that are links to image attachments in other pages:

<ac:image>
    <ri:attachment ri:filename="home.jpg">
        <ri:page ri:content-title="Let's edit this page (step 3 of 9)" ri:space-key="ds" />
    </ri:attachment>
</ac:image>

I can obtain the linked page's space (ds) and title (Let's edit this page (step 3 of 9)) from the OutgoingLink object but not the filename.

How can I obtain the filename (home.jpg) so that I can access and download the linked attachment?

3 answers

1 accepted

0 votes

Answer accepted

The easy way, though it requires jsoup:

Page page = <Load the page here>;
// renderer is assumed to be an injected Confluence Renderer
final ConversionContext conversionContext = new DefaultConversionContext(page.toPageContext());
String rendered = renderer.render(page.getBodyAsString(), conversionContext);
// Parse the rendered HTML and collect every <img> element
Document doc = Jsoup.parse(rendered);
Elements images = doc.select("img");
for (Element el : images) {
    String imageUrl = el.attr("src");
    // Do something with the image URL
}

The harder way:

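Page p = <Load the page here>;
// Cheap pre-check: does the storage-format body contain any ac:image element?
Pattern p1 = Pattern.compile("<ac:image[^>]*>.*?</ac:image>", Pattern.DOTALL);

Matcher m = p1.matcher(p.getBodyAsString());
if (m.find()) {
    // The body is an XML fragment, so wrap it in a single root element before parsing
    Document doc = loadXMLFromString("<root>" + p.getBodyAsString() + "</root>");
    NodeList images = doc.getElementsByTagName("ac:image");
    for (int i = 0; i < images.getLength(); i++) {
        Element element = (Element) images.item(i);
        // According to the probable scenarios (1), the attributes sit on the
        // child elements of ac:image (ri:attachment, ri:page, ri:url).

        // On the matching child element, do something like:
        // attachment.getAttribute("ri:filename");
        // page.getAttribute("ri:space-key");
        // page.getAttribute("ri:content-title");
        // url.getAttribute("ri:value");
    }
}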

private Document loadXMLFromString(String xml) throws Exception {
  DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
  DocumentBuilder builder = factory.newDocumentBuilder();
  InputSource is = new InputSource(new StringReader(xml));
  // Note: setEncoding has no effect when the InputSource wraps a character stream
  is.setEncoding("UTF-8");
  return builder.parse(is);
}

where (1) is https://confluence.atlassian.com/doc/confluence-storage-format-283640220.html
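
For reference, here is a minimal, self-contained sketch of the DOM approach using only the JDK's XML parser. The class name `ImageAttachmentRefs` and the `Ref` holder are hypothetical names made up for illustration, and the sketch assumes the page body is well-formed XML once wrapped in a single root element (bodies containing HTML entities such as `&nbsp;` would need extra handling).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ImageAttachmentRefs {

    /** Hypothetical holder for one attachment reference found inside an ac:image element. */
    static final class Ref {
        final String filename;
        final String pageTitle; // null when no ri:page child is present
        final String spaceKey;  // null when no ri:space-key is given

        Ref(String filename, String pageTitle, String spaceKey) {
            this.filename = filename;
            this.pageTitle = pageTitle;
            this.spaceKey = spaceKey;
        }
    }

    static List<Ref> extract(String storageBody) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Stay non-namespace-aware: the ac: and ri: prefixes are not declared in a bare fragment.
        factory.setNamespaceAware(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Wrap the fragment so the parser sees exactly one root element.
        String wrapped = "<root>" + storageBody + "</root>";
        Document doc = builder.parse(new InputSource(new StringReader(wrapped)));

        List<Ref> refs = new ArrayList<>();
        NodeList images = doc.getElementsByTagName("ac:image");
        for (int i = 0; i < images.getLength(); i++) {
            Element image = (Element) images.item(i);
            NodeList attachments = image.getElementsByTagName("ri:attachment");
            if (attachments.getLength() == 0) {
                continue; // not an attachment-backed image
            }
            Element attachment = (Element) attachments.item(0);
            String title = null;
            String space = null;
            NodeList pages = attachment.getElementsByTagName("ri:page");
            if (pages.getLength() > 0) {
                Element page = (Element) pages.item(0);
                if (page.hasAttribute("ri:content-title")) {
                    title = page.getAttribute("ri:content-title");
                }
                if (page.hasAttribute("ri:space-key")) {
                    space = page.getAttribute("ri:space-key");
                }
            }
            refs.add(new Ref(attachment.getAttribute("ri:filename"), title, space));
        }
        return refs;
    }

    public static void main(String[] args) throws Exception {
        String body = "<ac:image><ri:attachment ri:filename=\"home.jpg\">"
                + "<ri:page ri:content-title=\"Let's edit this page (step 3 of 9)\" ri:space-key=\"ds\" />"
                + "</ri:attachment></ac:image>";
        for (Ref ref : extract(body)) {
            System.out.println(ref.spaceKey + " / " + ref.pageTitle + " / " + ref.filename);
        }
    }
}
```

The attributes are read from the `ri:attachment` and `ri:page` children rather than from `ac:image` itself, which matches the markup shown in the question.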

Hope it helps

Thanks for the information. It was very useful to see a solution based on the page data rather than page-related objects. I went for a combination of the two solutions that you provided:

Document doc = Jsoup.parse(page.getBodyAsString(), "", Parser.xmlParser());
Elements images = doc.getElementsByTag("ac:image");
for (Element image : images) {
    Element attachment = image.getElementsByTag("ri:attachment").first();
    if (attachment != null) {
        String fn = attachment.attr("ri:filename");
        Element sourcePage = attachment.getElementsByTag("ri:page").first();
        if (sourcePage != null) {
            String ct = sourcePage.attr("ri:content-title");
            String sk = sourcePage.attr("ri:space-key");
        }
    }
}

I am glad it helped :)

0 votes

Hi Panos, thanks for your prompt reply.

I'm working on a Java plugin that downloads the images displayed in a page.

I can access the images directly attached to the page using attachmentManager.getLatestVersionsOfAttachments(page);

However, some images are displayed using links to attachments contained on other pages and to access these I was using page.getOutgoingLinks();

The problem is that the OutgoingLink object provides the space and title information for the referred page, but not the attachment filename.

It's the filename that I need, and I cannot see any way of obtaining it other than using a regex on the page content.
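
A rough sketch of what that regex fallback could look like, using only `java.util.regex`. The class name `AttachmentFilenameRegex` is hypothetical, and the pattern assumes attribute values are double-quoted as in the example markup; it returns the raw attribute text, so XML escapes such as `&amp;` are not decoded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AttachmentFilenameRegex {

    // Captures the ri:filename value of each ri:attachment element in the storage-format body.
    private static final Pattern FILENAME =
            Pattern.compile("<ri:attachment\\b[^>]*\\bri:filename=\"([^\"]*)\"");

    static List<String> filenames(String storageBody) {
        List<String> result = new ArrayList<>();
        Matcher m = FILENAME.matcher(storageBody);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String body = "<ac:image><ri:attachment ri:filename=\"home.jpg\">"
                + "<ri:page ri:content-title=\"Let's edit this page (step 3 of 9)\" ri:space-key=\"ds\" />"
                + "</ri:attachment></ac:image>";
        System.out.println(filenames(body));
    }
}
```

This only recovers the filename; pairing it with the right page title and space key is easier with a real XML parse.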

If I were you, I would follow the outgoing links and parse the ac:image macro. I happen to be working on parsing a similar case, so if you decide to go that way I can help you further.

Thanks for the information. Yes, I would like to see how you are parsing the ac:image tag. I am using the XhtmlContent class, but it does not see ac:image as a macro.

0 votes

This is the relative format, so home.jpg is attached to the page where this macro is located.

Since you don't explain how you are obtaining it (Java? JavaScript? something else?), I'll list some options:

1) Rest call to find the url:

Make a call to /rest/prototype/1/content/PAGEID_HERE/attachment, find the attachment with filename="home.jpg", and extract the "link" -> "href" field from the JSON.

2) Use backend:

Page p=pageManager.getPage(PAGEID_HERE);
Attachment attachment = attachmentManager.getAttachment(p, "home.jpg");
attachment.getUrlPath();
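
For option 1, building the request from Java could be sketched like this with the JDK's `java.net.http` types (Java 11+). The class name `AttachmentListRequest`, the base URL, and the page id are placeholders made up for illustration, and the `Accept: application/json` header is an assumption about how to ask the endpoint for JSON; sending the request and reading "link" -> "href" from the response is left out.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class AttachmentListRequest {

    // Builds a GET request for the attachment list of the given page.
    static HttpRequest build(String baseUrl, long pageId) {
        // Drop a trailing slash so the path is not doubled up.
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        URI uri = URI.create(base + "/rest/prototype/1/content/" + pageId + "/attachment");
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json") // assumption: requests the JSON representation
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder base URL and page id.
        HttpRequest request = build("http://localhost:8090/", 12345L);
        System.out.println(request.method() + " " + request.uri());
    }
}
```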
